stas        02/01/06 08:54:58

  Modified:    src/docs config.cfg
  Added:       src/docs/1.0/guide CHANGES advocacy.pod browserbugs.pod
               config.cfg config.pod control.pod correct_headers.pod
               databases.pod dbm.pod debug.pod download.pod frequent.pod
               hardware.pod help.pod install.pod intro.pod modules.pod
               multiuser.pod performance.pod perl.pod porting.pod
               scenario.pod security.pod snippets.pod start.pod
               strategy.pod style.css troubleshooting.pod
               src/docs/1.0/guide/code DB_File-Lock2.pm My-DB.pm
               lwp-bench.pl mysql-3.22.29_backup.pl
               mysql-3.22.29_restore.pl mysql-3.22.30+_backup.pl
               mysql-3.22.30+_restore.pl
  Log:
  - porting the 1.x guide

  Revision  Changes    Path
  1.2       +1 -1      modperl-docs/src/docs/config.cfg

  Index: config.cfg
  ===================================================================
  RCS file: /home/cvs/modperl-docs/src/docs/config.cfg,v
  retrieving revision 1.1
  retrieving revision 1.2
  diff -u -r1.1 -r1.2
  --- config.cfg	5 Jan 2002 19:26:01 -0000	1.1
  +++ config.cfg	6 Jan 2002 16:54:56 -0000	1.2
  @@ -10,7 +10,7 @@
       group => 'mod_perl 1.x Documentation',
       docsets => [
  -        # '1.0/guide',
  +        '1.0/guide',
           '1.0/win32',
           '1.0/faqs',
           '1.0/api',

  1.1                  modperl-docs/src/docs/1.0/guide/CHANGES

  Index: CHANGES
  ===================================================================
  ###################################
  ### mod_perl Guide CHANGES file ###
  ###################################

Nov 15 2001 ver 1.31

* intro.pod:
  o updated the long-overdue credits section (~200 contributors in
    total!)

* install.pod:
  o add "When DSO can be Used" (Doug)

* modules.pod:
  o add Module::Use
  o add Apache::ConfigFile

* debug.pod:
  o noted the fact that the technique of detecting aborted connections
    doesn't work with mod_proxy.

* performance.pod:
  o removed a dead link from the last section:
    http://members.nbci.com/Alex_Maranda/gnuintel/GNUintel.htm

* snippets.pod:
  o detecting SSL connection (Vivek Khera, Geoff Young, Issac Goldstand)
  o note the Apache::Request->instance class method in addition to the
    POST2GET example (Robin Bjorn)

* porting.pod:
  o s/headers_out/header_out/ where it was incorrectly used (Issac
    Goldstand)
  o fix the potential bug when using -r $file followed by -M _ with
    'do $filename' in between, which may call stat() so that _ won't
    hold the right stat struct (see the sketch below). (Randy Kobes)

* troubleshooting.pod:
  o SegFaults During Startup

* help.pod:
  o add a reference to http://lists.perl.org/
  o add references to the new mailing lists

* code:
  o the results in lwp-bench.pl are correctly based on the number of
    @urls used (Boris Zentner)

* strategy.pod:
  o new section: "Closing Lingering Connections with Lingerd"
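A minimal sketch of the stat-cache pitfall mentioned in the porting.pod
item above ($file and $filename are hypothetical):

    # `_' holds the stat struct of the last file stat()ed, so any
    # implicit stat() between the -r test and the -M _ test breaks it:
    if (-r $file) {         # stat($file) populates `_'
        do $filename;       # do() may stat() other files internally
        my $age = -M _;     # BUG: `_' may now describe another file
    }

    # safe: name the file explicitly instead of relying on `_'
    my $age = -M $file;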
09.02.2001 ver 1.30

* snippets.pod:
  o adding a note about not being able to set Content-type and
    Content-encoding headers in non-OK responses (Geoffrey Young).

* modules.pod:
  o removing a myth about Apache::Registry suffering from
    "XYZ123456.html ==> /perl/foo.pl?p1=XYZ&p2=123456" kind of URI
    rewrites (multiple cached versions of the same file). It's not
    true; at least it works OK now. (Geoff Young)

* install.pod:
  o covalent/ex-raven + mod_perl installation scenario nuked, since
    Covalent doesn't distribute their sources any more, but only as a
    DSO package.

* porting.pod:
  o added a ref to Text::Reform in the format/write section (Matt
    Sergeant)
  o s/\$^M/\$^T/ (Glenn)
  o explained the caveat of not having the path in @INC with
    Apache::Reload.

* help.pod:
  o adding a link to the High Availability project
  o adding a link to the list of Apache projects' mailing lists

* Minor corrections:
  o Aaron Johnson went through the guide and cleaned it up from typos
    and mistakes in English.
  o strategy.pod: (Mike MacKenzie)

04.26.2001 ver 1.29

* dbm.pod:
  o updated "Flawed Locking Methods Which Must Not Be Used" with notes
    about lock upgrading (David Harris)

* strategy.pod:
  o added a ref to the light and fast Boa webserver

* scenario.pod:
  o cleared out the section on open proxying with mod_proxy (Eric
    Cholet)

* multiuser.pod:
  o extended the "Virtual Servers Technologies" section with freevsd,
    freevmware, vmware and S/390 references.

* snippets.pod:
  o removed the cache control section -- it's covered in the HTTP
    headers chapter.
  o subrequests and notes working together (Darren Chamberlain)

* performance.pod:
  o "Interpolation vs List" update: wrongly used the 'concatenation'
    term instead of interpolation (Mark Summerfield)
  o "Interpolation, Concatenation or List" was rewritten
  o new: "Architecture Specific Compile Options" (Tim Bunce, Perrin
    Harkins, Greg Cope, Owen Williams, Vivek Khera, Steve Fink, James
    W Walden)

* modules.pod:
  o extended the docs of the Apache::SubProcess module

* porting.pod:
  o "using register_cleanup in startup.pl to register cleanup action
    at the server shutdown or restart" (Doug)

* config.pod:
  o cleared the item which was falsely stating that the globals
    defined in startup.pl cannot be seen by the child processes (see
    the sketch below). (Richard Chen)

* install.pod:
  o updated "Discovering Whether Some Option Was Configured": added
    Apache::MyConfig
  o debian/apt install notes updates (Neil Conway)
  o some callback hooks aren't enabled by ALL_HOOKS=1 (Neil Conway)

* download.pod:
  o update the location of mod_proxy_add_forward.c (Ask Bjorn Hansen)

* help.pod:
  o added a link to Andrew Ford's Apache and mod_perl pocket books.
  o added a link to take23.org
  o added an XS resources section
  o added a link to the scalable list archive
  o removed the mailing list post address, to make it easier for Ask
    to filter SPAM.

* troubleshooting.pod:
  o new: "exit signal Segmentation fault (11)" with mysql (Matt
    Sergeant)
  o improved the docs of fixing a broken /dev/null

* debug.pod:
  o updated "gdb says there are no debugging symbols" -- a simpler
    technique to have the binary unstripped during make install.

* Minor corrections:
  o debug.pod (Alexander Farber)
  o performance.pod (Marc Lehmann, Kees Vonk)
  o snippets.pod (Ime Smits)
  o porting.pod (Michele Beltrame)
  o config.pod (Surat Singh Bhati, Paul Cotter)
  o control.pod (Aaron Johnson, Cliff Rayman, Yann Kerhervé)
  o modules.pod (Daniel Bohling)
  o install.pod (Kevin Swope, Jamie)
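A minimal illustration of the config.pod correction above: globals
compiled in startup.pl are visible in the children, because every child
is fork()ed from the parent that compiled them (the names here are made
up):

    # startup.pl, compiled once in the parent via PerlRequire
    $MyApp::Config{db_host} = 'localhost';

    # any handler, running later in a forked child
    my $host = $MyApp::Config{db_host};   # 'localhost' -- inherited
                                          # from the parent at fork()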
01.11.2001 ver 1.28

* Makefile.PL: fixed to generate a correct Makefile on Solaris: removed
  a space/extra new line after some manually created targets (Lupe
  Christoph)

* databases.pod:
  o "the morning bug" section was updated

* scenario.pod:
  o updated the sample squid configuration directives to work with
    version 2.3.STABLE2.

* strategy.pod:
  o corrected the math example with the 56k modem, where the result
    was a ratio (times) and not seconds (thanks to the students from
    my class at YAPC::Europe 2000)
  o corrected the math example using the MaxClients formula (for split
    code sets)
  o verified that squid's default read-ahead (send) buffer is 16KB

* dbm.pod:
  o reviewed/rewritten/corrected

* performance.pod:
  o corrected another math example calculating the real MaxClients
    when shared memory is taken into account (thanks to an anonymous
    bearded student from my class at YAPC::Europe 2000; see the sketch
    below)
  o new section: "Measuring the Memory of the Process"
  o the "Forking and Executing Subprocesses from mod_perl" section has
    been almost completely rewritten
  o the "Global vs. Fully Qualified Variables" section was rewritten

* modules.pod:
  o Apache::PerlVINC has changed its interface, updated to ver 0.03
    (Todd Finney)

* help.pod:
  o "Get help with DBI and SQL" updated with new resources
  o added a link to a tuning manual for the Solaris 2.x TCP/IP stack
    and other webserver related info. Useful for all Unix flavors.

* install.pod:
  o removed the section with the deb package links from David
    Huggins-Daines, since the link to the package is dead.

* porting.pod:
  o new tip: "Redirecting STDOUT into a Scalar" (Chris Nokleberg)

* Minor corrections:
  o perl.pod: Jean-Louis Guenego
  o performance.pod: (Richard Chen, Ulrich Neumerkel)
  o install.pod: (Artur Zambrzycki)
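A back-of-the-envelope version of the MaxClients arithmetic referred to
above, assuming memory is the only constraint (all numbers are made
up): the shared pages exist once, and each child adds only its unshared
portion, so shared + N * (child - shared) must fit in RAM.

    # hypothetical sizes, in MB
    my $total_ram   = 512;   # RAM available to Apache
    my $child_size  = 10;    # total size of one httpd child
    my $shared_size = 6;     # portion shared with the parent

    my $max_clients = int( ($total_ram - $shared_size)
                           / ($child_size - $shared_size) );
    print "MaxClients $max_clients\n";    # 126 with these numbers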
11.26.2000 ver 1.27

* intro.pod:
  o updated the long-overdue credits section

* correct_headers.pod:
  o added a link to an info page: "Prevent the browser from Caching a
    page"

* multiuser.pod:
  o added a ref to "The User-mode Linux Kernel" project (related to
    running many servers on the same machine safely for the system).

* performance.pod:
  o added the http_load utility section
  o added notes about latency problems with db transactions in
    "Persistent DB Connections" (Rodger Donaldson)

* download.pod:
  o added links to the http_load and Daquiri (only a link to the
    backhand site) utilities.

* troubleshooting.pod:
  o foo ... at /dev/null line 0 (Honza Pazdziora)
  o httpd keeps on growing after each restart (Perrin Harkins)

* debug.pod:
  o suggestion to use warn while debugging (Kenny Gatdula)
  o extended the section on using strace(1).
  o documented the fact that Apache::Status shouldn't be used on
    production machines because of the overhead that it creates
    (Doug).
  o extended the section "Safe Resource Locking and Cleanup Code" with
    a localized file globs example (Doug)

* help.pod:
  o added cgi-list subscription info (Peter J. Schoenster)
  o started new sections: 'Get help with Unix OS flavors -- Unix OS
    related resources' and 'Get help with Performance and Scalability'

* perl.pod:
  o new: Understanding Closures -- the Easy Way (Randal Schwartz,
    Perrin Harkins, Stas)
  o Exceptions section update: Exception::Class and Devel::StackTrace
    (Matt Sergeant and Dave Rolsky)

* porting.pod:
  o new item: "Return Codes under Apache::Registry" (Eric Cholet)
  o "-M and other time() file tests under mod_perl" -- added a nice
    TransHandler to handle the time resetting (Doug)
  o adding local() at "-M and other time() file tests under mod_perl"
    (Andreas Koenig)
  o documented Apache::Reload
  o correction: PERL5LIB is not ignored when PerlTaintCheck is on

* install.pod:
  o mod_proxy_add_forward setup extended with notes about
    --permute-module to make it easy to place mod_proxy_add_forward
    before mod_proxy (Larry Leszczynski)
  o mod_perl and php install scenario (Roy Nasser)
  o reviewed and extensively edited.
  o added info about the Aphid Apache Installer
  o removed the section about the experimental compile option
    PERL_MARK_WHERE since it should go away with 2.0 (it's still
    mentioned in debug.pod: "Finding the Line Which Triggered the
    Error or Warning") (Doug)

* scenario.pod: reviewed and extensively edited.

* strategy.pod: reviewed and extensively edited.

* Minor corrections:
  o install.pod: (Neil Conway, Lance Cleveland)
  o perl.pod: (Pavel Shmidt)
  o config.pod: (Michael Rendell)

08.22.2000 ver 1.26-fix-01

* build: fixed the Makefile.PL to detect the prerequisites. The
  MANIFEST file is OK now (it didn't accept files starting with './')
  (Marcel Grunauer)

* troubleshooting.pod: install_driver(Oracle) failed: Can't load
  '.../DBD/Oracle/Oracle.so' for module DBD::Oracle (Ed Park)

08.21.2000 ver 1.26

* the mod_perl guide's build process is now completely migrated into an
  external module, distributed in the package Pod::HtmlPSPdf.

* install.pod: Raven SSL installation notes updated (Doug, Geoffrey
  Young)

* troubleshooting.pod: install_driver(Oracle) failed: Can't load
  '.../DBD/Oracle/Oracle.so' for module DBD::Oracle (Yann Ramin,
  Richard Chen)

* security.pod: the code sample in "Non authenticated access for
  internal IPs, Authenticated for external IPs" is corrected (Will
  Trillich) (see the sketch below)

* porting.pod: "CHECK Blocks" (Doug)

* scenario.pod: the dual server installation scenario was completely
  rewritten, to accommodate a better (in my opinion) directory layout
  and a simpler installation process.

* perl.pod: "Exception Handling for mod_perl" was improved by Matt
  Sergeant.

* Minor corrections:
  o config (Ron Pero)
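The idea behind the corrected security.pod example above is to skip the
authentication phase for internal addresses; a minimal sketch with the
mod_perl 1.x API (the package name and address range are made up):

    package My::Access;
    use strict;
    use Apache::Constants qw(OK);

    sub handler {
        my $r = shift;
        # internal clients pass without authentication: replace the
        # authen phase with a no-op for this request only
        if ($r->connection->remote_ip =~ /^10\.1\.1\./) {
            $r->set_handlers(PerlAuthenHandler => [ sub { OK } ]);
        }
        return OK;   # external clients fall through to real authen
    }
    1;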
08.05.2000 ver 1.25

* License: people asked me to redistribute the mod_perl Guide in other
  ways than just mirroring the mod_perl site. Therefore I've licensed
  the guide under the GPL and included the required info in the CPAN
  package.

* perl:
  o added "Using Non-Hardcoded Configuration Module Names" (Chris
    Winters)

* debug:
  o updated: "How can I find out if a mod_perl code has a memory leak"
  o rewritten:
      Handling the 'User pressed Stop button' case
      Detecting Aborted Connections
      The Importance of Cleanup Code
      Critical Section
      Safe Resource Locking and Cleanup Code

* config:
  o added a sub header "Running CGI, PerlRun, and Registry Scripts
    located in the same Directory" to make the info easier to find.
    (Ron Pero)
  o PerlAddVar info was added

* control:
  o "Swapping Prevention" rewritten from scratch and moved from the
    performance chapter to the control chapter (Ed Phillips, Barrie
    Slaymaker, Joshua Chamas)
  o "Preparing for Machine Reboot" -- added a section describing the
    use of chkconfig(8) (Andreas Koenig)

* porting:
  o the wrong suggested solution to the nested sub problem was
    spotted!!! (Hunter Monroe, Hailei Dai) It's fixed now.
  o update: "Terminating requests and processes, the exit() and
    child_terminate() functions" -- under perl5.6 you don't need to
    override exit anymore! (Doug, Eric Cholet)
  o s/PerlTaintMode/PerlTaintCheck/ (Gunther Birznieks)

* review:
  o Mark Summerfield has reviewed these chapters: modules,
    browserbugs, security and start.

* performance:
  o "CGI.pm vs Apache::Request" and "Apache::args vs
    Apache::Request::param" were merged into a single section called
    "Apache::args vs Apache::Request::param vs CGI::param"
  o the first example showing the use of ab was corrected (Joe
    Schaefer)
  o rewritten:
      + Apache::Registry PerlHandler versus Custom PerlHandler
      + Keeping the Shared Memory Limit
      + Limiting the Size of the Processes
      + Limiting Other Resources Used by Apache Child Processes

* modules:
  o "Apache::GTopLimit - Limit Apache httpd processes" merged into the
    performance chapter.
  o Apache::PerlVINC configuration corrected (Patrick)

* Minor corrections:
  o config (Carl Hansen, Ron Pero, Jeff Chan, Cliff Rayman, Marcel
    Grunauer)
  o control (Marcel Grunauer)
  o debug (Marcel Grunauer)
  o install (Ron Pero)
  o snippets (Ask Bjoern Hansen, Chris Nokleberg)
  o perl (Will Trillich, Cliff Rayman, Jason Rhinelander)
  o porting (Ged Haywood)

06.07.2000 ver 1.24

* perl: "catching exceptions" -- a few corrections (Matt Sergeant)

* modules: added Apache::Gzip (Ken Williams)

* guide's design:
  o put back the link underlining
  o background-color: #ffffee;
  o added (jump) menus to reach search/download/index from everywhere
  o added the two new search engines (both working on the split
    version of the guide)

* guide's build:
  o the build code was completely rewritten; html2ps is now bundled
    with the guide so one can create the PS version, and, if ps2pdf is
    present, the PDF version.
  o it can produce the split version of the Guide.
  o one can easily reuse the package to build one's own docs, since
    the look-n-feel has been moved from the code into the templates.
  o once I feel confident I'll probably separate the build code from
    the Guide to give it its own life and make it easier for other
    developers to reuse. I need testers who want to use this package
    before I release it as a separate module.

* performance:
  o old sections rewritten/improved:
    - Are My Variables Shared?
    - Preloading Registry Scripts
    - Calculating Real Memory Usage
    - Preloading Perl Modules at Server Startup
    - Preloading Registry Scripts at Server Startup
    - Forking and Executing Subprocesses from mod_perl
      = Spawning a Detachable Sub-Process
      = Gory Details About fork()
      = Executing system() in the Right Way
      = Avoiding Zombie Processes
  o new sections:
    - Apache/mod_perl Build Options
      = mod_perl Process Size as a Function of Compiled in C Modules
        and mod_perl Features
    - Modules Initializing at Server Startup (see the sketch below)
      = Initializing DBI.pm (corrections by Tim Bunce)
      = Initializing CGI.pm

* control:
  o these sections were rewritten and extended:
    - Server Maintenance Chores
      = Handling Log Files
        + Log Rotation
        + Non-Scheduled Emergency Log Rotation
  o 'cyclog' is now called multilog (from the daemontools package)
  o moved from debug and rewritten: "Speeding up the Apache
    Termination and Restart"
  o rewritten and extended "SUID Start-up Scripts" with:
      "Introduction to SUID Executables"
      "Apache Startup SUID Script's Security"
      "Sample Apache Startup SUID Script"

* hardware: partly rewritten and improved.

* reconstruction process: obvious.pod has been disassembled and merged
  into debug.pod.

* debug: update: Apache::DumpHeaders (Ask Bjoern Hansen)

* troubleshooting: PerlFreshRestart is irrelevant for DSO (Vivek Khera,
  Doug)

* review:
  o Drew Taylor has reviewed the scenario and strategy chapters.
  o Mark Summerfield has reviewed the scenario, perl and strategy
    chapters.
  o Eric Cholet has reviewed the porting chapter.

* Minor corrections:
  o download (Salve J Nilsen)
  o performance (w trillich)
  o perl (Scott Holdren)
  o snippets+download (Ask Bjoern Hansen)
  o debug (Robert Mathews)
  o scenario (Tuomas Salo)
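A minimal sketch of the startup-time module initialization mentioned in
the 1.24 performance notes above -- preloading DBI and installing its
driver in the parent, so the driver code is shared by all children (the
"mysql" driver is an assumption):

    # startup.pl
    use DBI ();
    DBI->install_driver('mysql');   # load DBD::mysql once, pre-fork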
05.12.2000 ver 1.23

* guide's layout changed: now there are two index files -- the default
  index.html shows only the names of the chapters in the TOC, the new
  index_long.html shows the full TOC as before.
* guide's layout changed: changed to simple black-on-white, no fancy
  frames and colors anymore.
* guide: changed the order of the chapters toward a logical sequence.
* snippets: new: "Sending Cookies in REDIRECT Response" (Doug)
* help: new: added the digest list info (Ask Bjoern Hansen)
* performance: new: "Limiting the Number of Processes Serving the Same
  Resource"
* troubleshooting: updated: "RegistryLoader: Translation of uri [...]
  to filename failed"
* porting: update: "using format() and write()" -- using sprintf (Matt
  Sergeant)
* perl: new: "Variables Globally, Lexically Scoped And Fully Qualified"
  (Ged W. Haywood)
* build suite: documenting the build script so others could reuse this
  code in their documentation generation chores.
* performance: a complete reorganization of the content toward better
  navigability.
* strategy: removed "Multithreading or not Multithreading" -- it has
  flaws and needs a rewrite.
* performance: update: "KeepAlive" -- works only if there is a
  Content-Length header (Andreas Koenig) (see the sketch below)
* help: new: "get help with CVS"
* troubleshooting: new: "Segfaults when using XML::Parser" (DeWitt
  Clinton)
* performance: new: "Do Not Run Everything on One mod_perl Server"
  (Joshua Chamas, Shane [EMAIL PROTECTED], Gunther Birznieks)
* minor corrections: scenario (Eric Jain), debug (Geoffrey Young).
* install: new: "mod_perl and Raven SSL" (Doug)
* scenario: new: "mod_proxy: Security Issues" (Eric Cholet)
* performance: new: "Improving Perl Code Performance Hints" =>
  "Concatenation or List" (Doug)
* debug: new: "hanging processes detection: Using the Perl Trace"
* debug: new: "Hanging because of the OS Problem" (Greg Stark)
* install: new: "About gdbm, db and ndbm libraries" (Les Mikesell)
* performance: new: "Benchmarking Apache::Registry and Perl Content
  Handler"
* performance: new: "Benchmarking CGI.pm and Apache::Request"
* porting: new: "Transitioning from Apache::Registry to Apache
  handlers"
* config: new: "Alias and Rewrite Conflicts" (Eric Cholet, Ask Bjoern
  Hansen, Vivek Khera)
* scenario: new: "Front-end Back-end Proxying with Virtual Hosts"
  (Vivek Khera, Eric Cholet)
* install: new: "APACHE_USER and APACHE_GROUP env" (Doug)
* config: new: "Overriding <Location> Setting in "Sub-Location""
  (Darren Chamberlain, Vivek Khera)
* review: Mark Summerfield has reviewed these chapters: porting,
  correct_headers, intro, multiuser, snippets and performance.
* troubleshooting: new: "Processes Get Stuck on Graceful Restart"
  (Doug)
* debug: updated: "Safe Resource Locking" -- added utils to trace the
  processes owning the open files. (Doug, Eric Cholet)
* databases: new: "Database Locking Risks"
* performance: update: "Limiting the Resources Used by httpd Children",
  an explanation of the soft and hard limits (Eric Cholet)
* help: added subscription info for the perl5-porters mailing list
* perl: new: "Exception Handling for mod_perl" (Matt Sergeant)
* review: Ged W. Haywood was very kind to review and correct the
  config, perl, dbm, snippets, advocacy, browserbugs, download, help,
  modules, troubleshooting, multiuser, obvious, correct_headers,
  status and hardware chapters.
* modules: new: "Apache::RequestNotes" (Geoffrey Young)
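A minimal illustration of the KeepAlive note above: keep-alive can only
be honored when the client can tell where the response ends, so the
handler should supply Content-Length explicitly (mod_perl 1.x API; the
payload is made up):

    my $body = "<html><body>hello</body></html>\n";
    $r->content_type('text/html');
    $r->header_out('Content-Length' => length $body);
    $r->send_http_header;
    $r->print($body);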
04.09.2000 ver 1.22

* intro: updated the long-overdue credits section
* snippets: new: "Authentication Snippets" (Michael Finke, Eric Cholet)
* debug: new: "Apache::DumpHeaders" (Ask Bjoern Hansen)
* debug: new: "Apache::DebugInfo" (Geoffrey Young)
* config: updated: "PerlFreshRestart" with DSO notes (Doug, Vivek
  Khera)
* troubleshooting: new: "Can't upgrade that kind of scalar ..." (Doug)
* performance: new: "Bloatware" (Tom Christiansen)
* performance: new: "CGI.pm versus Apache::Request" (Doug)
* performance: new: "Apache::Registry versus pure PerlHandler" (Doug)
* performance: new: "TMTOWTDI: Convenience and Performance"
* install: "Is it possible to run mod_perl enabled Apache as suExec?"
  (Randal L. Schwartz, Matt Sergeant)
* performance: updated: "Benchmarking PerlHandlers"
* performance: new: "Keeping the Shared Memory Limit"
* porting: new: "File tests operators"
* performance: updated "PerlSetupEnv Off" with a test script
* snippets: new: "Getting the Front-end Server's Name in the Back-end
  Server" (Sean Dague)
* porting: extended "Taint mode" -- a suggestion for services that move
  to mod_perl and have some scripts that won't run with Taint mode
  enabled (Gunther Birznieks, Ken Williams)
* modules: new: "Apache::OutputChain -- Chain Stacked Perl Handlers"
  (Honza Pazdziora, Eric Cholet)
* snippets: new: "SSI and Embperl -- Doing Both" (Michael Schout)
* performance: new: "-Dusemymalloc Perl Build Option" (Doug, Jeffrey W.
  Baker)
* strategy: added an unedited answer to the question "Multithreading or
  not Multithreading" (Shane [EMAIL PROTECTED])
* install: updated "Automating installation" (James G Smith)
* performance: added a workaround for limiting memory usage under Linux
  to "Limiting the Resources Used by httpd Children" (Kevin Murphy)
* debug: added a clarification of fd leakage in "Safe Resource Locking"
  (Bill Moseley)
* multiuser: extended the scenario for the multi-user/multi-webserver
  on one machine option: how to enforce port and similar settings
  without forbidding the user to modify their httpd.conf. (inspired by
  Greg Cope)
* scenario: new: "Caching in mod_proxy" (Ilya Obshadko)
* porting: new: "Accessing Request Object in non-Perl*Handler Modules"
  (Doug)
* config: new: "Adding Custom Configuration Directives" (Doug)
* snippets: new: "Convert a POST Request into a GET Request" (Doug)
  (see the sketch below)
* scenario: extended the "Getting the Remote Server IP in the Back-end
  server in the Proxy Setup" section with notes about the modules
  configuration precedence problem and its solution. (Ilya Obshadko,
  Ewan Edwards)
* strategy: new: "Pros and Cons of Building mod_perl as DSO" (based on
  http://www.apache.org/docs/dso.html by Ralf S. Engelschall)
* strategy: new: "mod_perl and mod_ssl" (Tom Mornini, Vivek Khera, Mads
  Toftum)
* snippets: added a "PerlTransHandler example" (Randal L. Schwartz,
  Ajay Shah)
* config: extended examples for "<Perl> sections"; added the package
  declaration caveat, the Apache::ReadConfig namespace and more.
* performance: extended the "Reducing the Number of stat() Calls"
  section with examples of a PerlTransHandler and of reducing stat()s
  for .htaccess look-ups
* help: added the perlmonth.com and Gerald Richter's article reference
* porting: updated "Right headers generation": OSes with EBCDIC as the
  character set and sending proper headers. (Doug)
* porting: added: "Apache::print() and CORE::print()" (Doug)
* debug: updated: the Sys::Signal, SIGALRM and perl5.6 issue (Doug and
  Bill Moseley)
* config: $Apache::Server::StrictPerlSections (Doug)
* scenario: more practical mod_rewrite examples by Geoffrey Young.
* minor corrections: download.pod by Ron Pool, scenario.pod by Terry
  West and Eric Cholet, config.pod by Richard More, performance.pod by
  Stephen Judd and Steve Reppucci, frequent.pod by Andreas Piesk.
* review: Mark Summerfield has joined the group of the kind people who
  review and correct the Guide. He has reviewed these chapters:
  advocacy, config, control, databases, dbm, debug, download, frequent,
  hardware, help, install.
* review: Ged W. Haywood was very kind to review and correct the debug
  chapter.
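A minimal sketch of the POST-to-GET conversion named above, with the
mod_perl 1.x API (assumes the POSTed body is urlencoded pairs):

    use Apache::Constants qw(M_GET OK);

    sub handler {
        my $r = shift;
        if ($r->method eq 'POST') {
            my $content = $r->content;    # read and drain the body
            $r->method('GET');
            $r->method_number(M_GET);
            $r->headers_in->unset('Content-length');
            $r->args($content);           # expose the pairs as a
                                          # query string
        }
        # ... continue processing as a normal GET request ...
        return OK;
    }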
03.04.2000 ver 1.21

* help: updated the mod_perl list subscription info, added the advocacy
  list info.
* download: HA section -- added more links (Robin Berjon, Gerd Knops)
* strategy: added a new section "When One Machine is not Enough for SQL
  DB and mod_perl" (Vivek Khera, Jeffrey W. Baker, John Armstrong, Mike
  Miller, Leslie Mikesell, Tom Brown, Ed Phillips, Marko van der Puil).
  Special thanks to Jeffrey for working on this section before it went
  into the Guide.
* performance: added the section "Memory Swapping is Considered Bad"
* modules: added the Apache::GTopLimit and Apache::Watchdog::RunAway
  module descriptions
* performance: new section "Is my Code Shared"
* download: added links to DB sites and the "Low-Cost Unix Database
  Differences" comparison page.
* config: new section "Enabling Remote Server Configuration Reports"
* performance: new section "Calculating Real Memory Usage"
* scenario: the mod_proxy section has been rewritten and extended with
  lots of OS specific information (Joshua Chamas, Oleg Bartunov, Gerald
  Richter, Radu Greab, Peter Haworth, Vivek Khera)
* performance: rewrote "Object Methods Calls Versus Function Calls" and
  split it into the original section and "Imported Symbols and Memory
  Usage" (James G Smith, David Mitchell, Doug, Matt Sergeant, Bill
  Moseley, Autarch)
* performance: added "Reducing the Number of stat() Calls" (Steve
  Reppucci)
* install: added a note on how to force the CPAN shell to install
  mod_perl when the same version is already installed.
* scenario: added "mod_rewrite Examples" (Vivek Khera, Tom Mornini)
* databases: added "Optimize: Run Two SQL Engine Servers" (Pascal
  Eeftinck)
* snippets: added "Using DESTROY to Finalize Output" with the
  redirection and loghandler fixes (Michael Blakeley) (see the sketch
  below)
* scenario: added the --target option to the section "One Plain and One
  mod_perl enabled Apache Servers" to make the ./configure stage
  somewhat simpler. (James Furness)
* snippets: added 2 modified versions of the mysql backup/restore
  scripts to work with 3.22.30+ (Ward Vandewege)
* moved complete files, previously inlined in the text, to external
  files. This makes the browsing faster when the files are very big.
  Added a new POD tag =code to create the link to the external file at
  the pod2html conversion stage.
* review: Ged W. Haywood was very kind to review and correct the
  following chapters: control, install
* performance: in "Forking and Executing Subprocesses from mod_perl"
  fixed a few typos in the code. (Anthony D. Ettinger, Matt Sergeant
  and Edwin Pratomo)
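The DESTROY-based finalizing mentioned above relies on a plain Perl
guarantee: an object's DESTROY fires when its scope is torn down, even
if the handler die()s on the way. A minimal sketch (My::Finalizer is a
made-up name):

    package My::Finalizer;
    sub new     { my ($class, $code) = @_; bless { code => $code }, $class }
    sub DESTROY { $_[0]->{code}->() }

    package main;
    my $fin = My::Finalizer->new(sub { warn "output finalized\n" });
    # ... normal processing; the sub runs when $fin goes out of scope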
02.09.2000 ver 1.20

* I've created a mod_perl_guide package and now it's available from
  CPAN! So when a new version gets released you just open your CPAN
  shell and do:

    % perl -MCPAN -eshell
    cpan> install mod_perl_guide

  All the POD sources and build scripts have moved to this
  distribution; get them from CPAN.
* There is no more PostScript file book style distribution. A better
  PDF version has replaced it (suggested by Andrew Ford). Note that you
  can use gv (ghostview) to view pdf files, if you don't have a free
  acroread installed. Or you can always use the 'pdf2ps' utility to
  convert it back to PS. BTW, this release's PDF file consists of 481
  Letter format pages.
* install: rewrote "What Compiler Should Be Used to Build mod_perl?"
* performance: added "Measuring the Subroutines Memory Usage"
* install: added "using http://localhost/perl-status?hooks" to check
  which hooks were enabled during the build process.
* debug: extended the "Looking inside the server" section with new info
  about Apache::Status, mod_status and Apache::VMonitor
* snippets: added "Emulating the Authentication Mechanism" (Eric
  Cholet)
* troubleshooting: runtime: added "Can't call method "register_cleanup"
  (CGI.pm)" (Doug, Nick Tonkin)
* snippets: added "Setting PerlHandler Based on MIME Type" (Doug)
* install: .makepl_args.mod_perl in $HOME, otherwise
  makepl_args.mod_perl (no leading dot) (Ask Bjoern Hansen)
* databases: completed the DBI trace info (Doug Kyle)
* modules: added and extended the Apache::Registry{BB|NG} sections
  (Doug)
* config: added "debugging <Perl> sections" (Doug)
* modules: added a ref to "Apache::RedirectLogFix" (Doug)
* snippets: added "Passing Notes Between mod_perl and other (non-perl)
  Apache Modules" (David Harris)
* snippets: added "mod_rewrite Based On Query String and URI
  Implemented in Perl" (Darren Chamberlain)
* config: added "Knowing the proxy_pass'ed Connection Type" (Geoff
  Crawshaw)
* perl: the regexp section added long ago, "lifted" from
  mod_perl_traps.pod, was written by David Landgren, and I apologize
  for not stating it here (I didn't know it was David who wrote it).
  Now everything is in its place.
* perl: added a workaround to make FindBin work under mod_perl (Joao
  Fonseca) (see the sketch below)
* install: "make test troubleshooting": added verification of the
  'localhost' entry's existence in '/etc/hosts'.
* debug: explained the confusion about Sys::Signal and SIGALRM (was
  already solved internally) (Doug)
* perl: John Hyland and Kavitha have pointed out a few typos of mine :)
* dbm: mentioned the DB_File::Lock and Tie::DB_FileLock modules and
  added a summary of all available dbm lock wrappers, ripped off from
  the DB_File::Lock manpage. (David Harris)
* installation: APACHE_PREFIX works only if APACI_ARGS is used (Joe
  Slag)
* databases: fixed sql_escape from the My::DB module's example (Yann
  Kerhervé)
* databases: starting from Apache::DBI 0.84, Apache::DebugDBI is
  discarded. (Mike Depot)
* debug: added the "PERL_DEBUG=1 Build Option"
* debug: added the "PERL_DESTRUCT_LEVEL Environment Variable" (Doug)
* databases: updated the DBI wrapper module
* install: added to the troubleshooting section the solution to the
  'PL_perl_destruct_level errors' problem, which happened during the
  mod_perl build process.
* review: Ged W. Haywood was very kind to review and correct the
  following chapters: frequent, perl, performance (!), scenario (!)
* perl: rewrote the "Tracing Warnings Reports" section
* porting: fixed a bad mistake in the code that prints the header only
  once in the "Generating correct HTTP Headers" section. Added more
  material about headers as well.
* config: added the "Options Values Merging" section
* config: the chapter was partially rewritten and reorganized
* config: completed the "Apache Restarts Twice On Start" section
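A minimal sketch of the FindBin workaround mentioned above: FindBin
computes $FindBin::Bin once, at load time, so under mod_perl every
later script would see the first script's directory unless the module
is reloaded:

    # force FindBin to recompute for the currently running script
    delete $INC{'FindBin.pm'};
    require FindBin;
    my $dir = $FindBin::Bin;   # now reflects the current script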
"\r\n\r\n"(Philip Newton) * debug: added "Debugging when Server Crashes on Startup before Writing to Log File" (Cliff Rayman) * snippets: added "Redirecting While Maintaining Environment Variables" (Vivek Khera) * troubleshooting: added "libexec/libperl.so: open failed: No such file or directory" (Christophe Dupre) * performance: added "Upload/Download of Big Files" (Ken Williams) * install: added a reference to "Static debian package" (David Huggins-Daines) * troubleshooting: added Windows: "Apache::DBI and PERL_STARTUP_DONE_CHECK" (Gerald Richter, Randy Kobes) * performance: added KeepAlive notes (Craig, Pascal Eeftinck) * performance: added HTML::Mason notes (Pascal Eeftinck) * porting: added new "die() and mod_perl" * porting/perl: moved most of the perl specific reference material into perl.pod removing duplications of this material on the way and replacing it with pointers to perl.pod * performance: rewritten "Object Methods Calls Versus Function Calls" * porting: FindBin is not mod_perl compatible (Andrei A. Voropaev, Joao Fonseca) * scenario: denoted the ProxyReceiveBufferSize limit by SO_RCVBUF in kernel (Vivek Khera) and kern.ipc.maxsockbuf=2621440 on BSD (Oleg Bartunov) * snippets: added "mysql backup and restore scripts" * snippets: added "Subclassing Apache::Request example" * snippets: added "CGI::params in the mod_perl-ish way" * debug: added "Using print() and Data::Dumper for Debugging" * snippets: started a "Sending email from mod_perl" topic * control: Preparing for Machine Reboot * download: added more load ballancing URLs * performance: added "Tuning with httperf" * intro: added "High-Profile Sites Running mod_perl" (Rex Staples) * review: Ged W. Haywood was very kind to review and correct the following chapters: start, intro, strategy, porting (!), databases, dbm, security. * install.pod: perl Makefile.PL troubleshooting - added "A test compilation with your Makefile configuration failed..." and "missing/misconfigured libgdbm.so" (Tom Brown and Steve Willer) * install.pod: make troubleshooting "unrecognized format specifier for..." during the build process (Scott Fagg) * porting: a bug in a script from "Exposing Apache::Registry secrets" spotted and fixed (John Deighan) * install.pod: integrated the "manual mod_perl build process" remarks and patch (Robin Berjon) * install.pod: don't put mod_perl sources in a sub-dir of Apache sources. It wouldn't build! (Ask Bjoern Hansen) * review: Dale Couch was very kind to review and correct the following chapters: porting 11.13.99 ver 1.18 * An almost complete rewrite of debug.pod: (Integrated Doug's debugging article at perlmonth.com) Curing The "Internal Server Error" Helping error_log to Help Us The Importance of Warnings diagnostics pragma Monitoring the error_log file Hanging processes: Detection and Diagnostics An Example of the Code that Might Hang the Process Detecting hanging processes Determination of the reason Handling the 'User pressed Stop button' case Detecting Aborted Connections The Importance of Cleanup Code Critical Section Safe Resource Locking Cleanup Code Handling the server timeout cases and working with $SIG{ALRM} Watching the server Configuration Usage Compiled Registry Scripts section seems to be empty. 
11.13.99 ver 1.18

* An almost complete rewrite of debug.pod (integrated Doug's debugging
  article at perlmonth.com):

    Curing The "Internal Server Error"
    Helping error_log to Help Us
    The Importance of Warnings
    diagnostics pragma
    Monitoring the error_log file
    Hanging processes: Detection and Diagnostics
    An Example of the Code that Might Hang the Process
    Detecting hanging processes
    Determination of the reason
    Handling the 'User pressed Stop button' case
    Detecting Aborted Connections
    The Importance of Cleanup Code
    Critical Section
    Safe Resource Locking
    Cleanup Code
    Handling the server timeout cases and working with $SIG{ALRM}
    Watching the server
    Configuration
    Usage
    Compiled Registry Scripts section seems to be empty.
    Sometimes script works, sometimes does not
    Code Debug
    Locating and correcting Syntax Errors
    Using Apache::FakeRequest to Debug Apache Perl Modules
    Finding the Line Number the Error/Warning has been Triggered at
    Using print() Function for Debugging
    The Importance of Good Coding Style and Conciseness
    Introduction into Perl Debugger
    Interactive Perl Debugging under mod_cgi
    Non-Interactive Perl Debugging under mod_perl
    Interactive Perl Debugging under mod_perl
    Interactive Perl Debugging under mod_perl and ptkdb
    Debugging core Dumping Code
    Apache::Debug
    Debugging Core Dumps
    Debug Tracing
    gdb says there are no debugging symbols
    Debugging Signal Handlers ($SIG{FOO})
    Code Profiling
    Devel::Peek
    How can I find if my mod_perl scripts have memory leaks
    Debugging your code in Single Server Mode

* A complete rewrite of install.pod (integrated the INSTALL.* docs from
  the mod_perl distribution):

    Installing mod_perl in 10 Minutes and 10 Command Lines
    The Gory Details
    Sources Configuration (perl Makefile.PL ...)
    Configuration parameters
    APACHE_SRC
    DO_HTTPD, NO_HTTPD, PREP_HTTPD
    Callback Hooks
    EVERYTHING
    PERL_TRACE
    APACHE_HEADER_INSTALL
    PERL_STATIC_EXTS
    PERL_MARK_WHERE
    APACHE_PREFIX
    APACI_ARGS
    Reusing Configuration Parameters
    Discovering whether some option was configured
    Using an alternative Configuration file
    mod_perl Building (make)
    make Troubleshooting
    undefined reference to 'Perl_newAV'
    Built Server Testing (make test)
    Manual Testing
    make test Troubleshooting
    make test fails
    mod_perl.c is incompatible with this version of apache
    make test......skipping test on this platform
    Installation (make install)
    Building Apache and mod_perl by Hand
    Installation Scenarios for Standalone mod_perl
    The All-In-One Way
    The Flexible Way
    Build mod_perl as DSO inside Apache source tree via APACI
    Build mod_perl as DSO outside Apache source tree via APXS
    Installation Scenarios for mod_perl and Other Components
    mod_perl and mod_ssl (+openssl)
    mod_perl and mod_ssl Rolled from RPMs
    mod_perl and apache-ssl (+openssl)
    mod_perl and Stronghold
    Note For Solaris 2.5 users
    mod_perl Installation with CPAN.pm's Interactive Shell
    Installing on multiple machines
    using RPM, DEB and other packages to install mod_perl
    A word on mod_perl RPM packages
    Getting Started
    Compiling RPM source files
    Mix and Match RPM and source
    Installing a single apache+mod_perl RPM
    Compiling libapreq (Apache::Request) with the RH 6.0 mod_perl RPM
    Installing separate Apache and mod_perl RPMs
    Testing the mod_perl API
    Installation Without Superuser Privileges
    Installing Perl Modules into a Directory of Choice
    Making Your Scripts Find the Locally Installed Modules
    CPAN.pm Shell and Locally Installed Modules
    Making a Local Apache Installation
    Actual Local mod_perl Enabled Apache Installation
    Local mod_perl Enabled Apache Installation with CPAN.pm
    Automating installation
    How can I tell whether mod_perl is running
    Testing by checking the error_log file
    Testing by viewing /perl-status
    Testing via telnet
    Testing via a CGI script
    Testing via lwp-request
    General Notes
    Should I rebuild mod_perl if I have upgraded my perl?
    Perl installation requirements
    mod_auth_dbm nuances
    Stripping apache to make it almost perl-server
    Saving the config.status Files with mod_perl, php, ssl and Other
    Components
    Should I Build mod_perl with gcc or cc?
    OS Related Notes
* databases: added "Debugging code which deploys DBI"
* porting: added "STDIN, STDOUT and STDERR streams"
* advocacy: added "A summary of the perl/cgi discussion at
  slashdot.org"
* snippets: added "Terminating a child process on Request Completion"
  (Doug)
* troubleshooting: added "Apache.pm failed to load!" (Doug)
* snippets: added "Reading POST Data, then Redirecting" (Doug)
* snippets: added "Cache control for regular and error modes" (Cliff
  Rayman)
* performance: added "Be careful with symbolic links" (the same script
  compiled twice)
* install: new "apache/mod_perl/mod_ssl Rolled from RPMs Scenario"
  (Stephane Benoit)
* porting: 'use subs (exit)' typo fixed (Chris Nokleberg)
* warnings.pod was renamed to troubleshooting.pod and now it's
  categorized by the following sections:

    Building and Installation
    Configuration and Startup
    Code Parsing and Compilation
    Runtime
    Shutdown and Restart

* porting: the following sections were moved to debug.pod: "Finding the
  Line Number the Error/Warning has been Triggered at", "Turning
  warnings ON", "diagnostics pragma"
* porting: rewrote "Command line Switches (-w, -T, etc)"
* performance: "Forking or Executing subprocesses from mod_perl"
  updated with another CHLD sighandler using WNOHANG to reap zombie
  processes (Lincoln Stein) (see the sketch below)
* install: updated "Testing via a CGI script" (Geoffrey S Young)
* porting: updated "Terminating requests and processes, the exit()
  function" with info about post_request termination, Apache::SizeLimit
  and Apache::GTopLimit
* performance: the links to
  http://www.realtime.net/~parkerm/perl/conf98/index.htm and
  http://www.realtime.net/~parkerm/perl/conf98/sld006.htm were dead, so
  I removed them :( (Peter Skov)
* snippets: added "Caching the POSTed Data" (Doug)
* install: "Compiling libapreq with mod_perl RPM" reviewed and
  corrected (Geoffrey S Young)
* status.pod has been eliminated and absorbed by debug.pod, where it
  belongs
* Fixed the pod translator. Now it correctly handles C<$r-E<gt>method>
  encodings. (Andreas Koenig)
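A minimal sketch of the WNOHANG-based zombie reaping mentioned in the
performance item above:

    use POSIX qw(WNOHANG);

    # reap any number of exited children without blocking, so no
    # zombie processes are left behind
    $SIG{CHLD} = sub {
        1 while waitpid(-1, WNOHANG) > 0;
    };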
10.16.99 ver 1.17

* intro: the CREDITS section was updated with the long list of
  contributors!!! Thank you all!!! If I've missed your name, please let
  me know!!!
* control: added "Safe Code Updates on a Live Production Server"
* control: added "An Intentional Disabling of Live Scripts"
* scenario: added a big new section "One Light and One Heavy Servers
  where ALL htmls are Perl-Generated" (Wesley Darlington)
* dbm: David Harris has detected a corruption with the locking methods
  suggested in the Camel book and the DB_File man page (at least before
  version 1.72). They are flawed, and if you use them in an environment
  where more than one process modifies the dbm file, it can get
  corrupted!!! I've modified the DB_File::Lock module to fix the
  problem by integrating the previously written DB_File::Wrap and the
  module David wrote (David Harris)
* snippets: added "Sending multiple cookies with the Perl API" (Rick
  Myers)
* install: added a big section "using RPM, DEB and other packages to
  install mod_perl" (Geoffrey S Young, David Harris)
* install: added "Automating installation" -- James G Smith's Apache
  Builder script
* install: added a new section "using CPAN to install mod_perl"
* performance: extended "Forking or Executing subprocesses from
  mod_perl" with information and code to avoid zombies.
* performance: added a converted-to-pod "Jeff's guide to mod_perl
  database performance" (Jeffrey W. Baker)
* new chapter: "Correct Headers", contributed by Andreas Koenig!!!
* help: updated the link to the DBI homepage (hermetica has gone)
* performance: added sizing benchmarks of CGI.pm's imported symbols
  (CGI.pm object method calls vs. function calls)
* porting: fixed a typo with local() and Special variables (Andrei A.
  Voropaev)
* snippets: fixed a taint problem in the sample error_log display
  script (John Walker)
* install: added "Should I Build mod_perl with gcc or cc" (Tom Hughes)
* warnings: added to the troubleshooting section "Missing right bracket
  at line " with a link to the item explaining that in porting.pod
  ("__END__ and __DATA__ tokens") (Eric Strovink)
* install: added a tip on saving the config.status files for each
  module build (php, mod_perl, ssl) for easier later reuse. (Dave
  Hodgkinson)
* performance: added a clarification to the "PerlSetupEnv Off" item
  (Doug)
* snippets: added "Passing environment variables between handlers"
  (Doug)
* warnings: added "Can't locate loadable object for module XXX" (Doug)
* config: corrected the <Perl> section dump typo (Gerald Richter)
* scenario: corrected the snippet that extracts the client IP from the
  X-Forwarded-For header to use headers_in instead of the obsolete
  header_in (Oleg Bartunov)
* scenario: added a note about Ben Laurie's Apache-SSL setting
  REMOTE_ADDR instead of the X-Forwarded-For header (Jie Gao)
* performance: started "Analysis of SW and HW Requirements" (Jeffrey W.
  Baker)
* warnings: clarification of "rwrite returned -1" (Eric Cholet)
* warnings: added "Invalid command 'PerlHandler" (Doug)
* debug: started "Apache::Debug" and Carp::confess("init") (Doug)
* install: "undefined reference to 'Perl_newAV'" documented (Doug)
* modules: added a clarification about Apache::PerlVINC (Doug)
* warnings: updated "Callback Called Exit & -D PERL_EMERGENCY_SBRK"
  (Doug)
* databases: added $Apache::DBI::DEBUG = 2 (instead of '1') for ver
  0.84+ (Edmund Mergl)
* performance: added "Caching prepare() statements" + rolling your own
  Apache::DBO code (Jeffrey Baker) (see the sketch below)
* porting: replaced "Apache::Registry::handler" with
  "Apache::Registry". It caused problems for some of the users (Daniel
  W. Burke)
* performance: added "Increasing the shared memory with mergemem" (no
  real info, but a link to the util's site. Please take a look and
  submit your opinions) (Doug Bagley)
* snippets: added "Redirect a POST request, forwarding the content"
  (Eric Cholet, Doug)
* performance: extended "Using $|=1 under mod_perl and better print()
  techniques" with notes about rflush()
* shuffled many items around to allow more intuitive searching for them
* performance: added "Cached stat() calls"
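A minimal sketch of the statement caching mentioned above: with
Apache::DBI keeping $dbh alive across requests, DBI can cache the
statement handles as well (the table and column names are made up):

    my $sth = $dbh->prepare_cached(
        'SELECT name FROM users WHERE id = ?');
    $sth->execute($user_id);
    my ($name) = $sth->fetchrow_array;
    $sth->finish;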
09.26.99 ver 1.16

* Many little things fixed and rewritten -- not worth listing them all
  here.
* index.html: added another search box for only the mod_perl FAQs and
  the guide, provided by perlreference.com
* config: added a note about Apache restarting twice on start
* warnings: added "syntax error at /dev/null" -- a broken /dev/null
  (Doug)
* porting: added "Special Perl Variables" -- using local()
* multiuser: added the considerations for not letting users execute
  their CGI scripts inside the mod_perl server, because of file
  permissions (a non-mod_perl problem) and the possibility of hijacking
  a DBI connection from the Apache::DBI pool of cached connections
  (Peter Galbavy)
* install: added "Is it possible to tell whether some option was
  included" -- nm() hints (Doug)
* performance: new "PerlSetupEnv Off" (Doug)
* porting: new section "Passing and preserving custom data structures
  between handlers" (Ken Williams) (see the sketch below)
* security: "OK, AUTH_REQUIRED and FORBIDDEN" in the authentication
  phase. (Eric Cholet, Jason Bodnar)
* porting: rewrote the "Generating correct HTTP Headers" section to
  talk about HEAD requests, PerlSendHeader, the Perl API to handle the
  headers generation, Cookie headers, and closure methods to send
  headers only once.
* Purifications: I'm very grateful to the people who take their time to
  help me improve the guide's readability. This time Richard A. Wells
  and Frank Schoeters submitted a few corrections to the text. Keep
  these corrections coming. Thanks!
* porting: extended the "Forking and Starting Sub-processes with
  mod_perl" section (Les Mikesell, Randal L. Schwartz)
* porting: wrote a whole new section, "Configuration Files: Writing,
  Modifying and Reloading", which consists of 3 big parts:

    Writing Configuration Files
    Reloading Configuration Files
    Dynamically updating configuration files

* scenario: updated the X-Forwarded-For section with notes on its
  non-reliability. (Ask Bjoern Hansen, Vivek Khera)
* porting: started the "Sharing variables between processes" section
  (Eric Cholet)
* config: dumping the configuration by <Perl> sections (Eric Cholet)
* performance: prepare_cached() in persistent connections.
* help.pod: updated a link to Jeffrey W. Baker's DBI examples (Jeffrey
  W. Baker)
* the list of mailing list archives was updated (Andreas J. Koenig, Jan
  Peter Hecking, Matthew Darwin, Christof Damian, Geoffrey S Young)
* debug.pod: the "Spinning httpds" section from mod_perl.pod
* config.pod: have stolen the sections "PERL METHOD HANDLERS", "STACKED
  HANDLERS" and "Perl*Handlers" from mod_perl.pod
* performance.pod: noted the -DTWO_POT_OPTIMIZE and -DPACK_MALLOC Perl
  options from perl5004delta.pod relevant to mod_perl
* config.pod: wrote the sections "PerlModule and PerlRequire
  directives" and Perl*Handlers.
* install.pod: "skipping test on this platform" during 'make test'
  explained. (Doug)
* warnings.pod: syntax error at /dev/null, explained (Doug)
* started to work on intro.pod to clear up the differences between the
  Perl API, Apache::Registry and Apache::PerlRun.
* install.pod: added "mod_auth_dbm nuances", an old notice from the
  mod_perl_traps page
* porting.pod: added the explanation of why you cannot use C<__END__>
  or C<__DATA__> within C<Apache::Registry> scripts.
* Removed the Cyan background from the postscript version of the guide.
  I liked the light grey background when the guide was printed on a B&W
  printer, but yes, it uses too much toner -- so it's gone :)
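One natural mechanism for the "custom data structures between handlers"
section named above is the 1.x request object's pnotes table, which
holds live Perl data for the duration of the request (the key and
contents here are made up):

    # stash a structure on the request in an early phase...
    $r->pnotes(user_rec => { id => 42, name => 'alice' });

    # ...and fetch it in any later phase of the same request:
    my $rec  = $r->pnotes('user_rec');
    my $name = $rec->{name};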
08.17.99 ver 1.15

* Richard A. Wells has kindly reviewed and corrected the following
  pods: advocacy.pod, download.pod, snippets.pod, status.pod,
  browserbugs.pod, intro.pod, start.pod
* security.pod: added the "Forcing reauthenticating" section
* index.pod: added a link to http://www.perlreference.com/mod_perl/
* help.pod: added links to modperl.sourcegarden.org and
  http://www.perlreference.com/mod_perl/
* performance.pod: a little fix to the crashme script (Jay J)
* updated the porting.pod sections "Sometimes it works, sometimes it
  doesn't", "Script's name space" and others as well
* config.pod: updated <Perl> sections (how to dump the sections (Eric
  Cholet) and how to use /perl-status for doing that).
* hardware.pod: David Landgren did a great job of reviewing, suggesting
  and correcting the OS/Hardware chapter!
* Andreas J. Koenig pointed out that it's unfair to mention eddieware
  without the others... I agree, Andreas! hardware.pod and download.pod
  were updated to point to the "High-Availability Linux Project"
  site... Eddieware was removed :)
* download.pod: now the guide hints where to find Apache::Request
  (libapreq-x.xx.tar.gz) -- at Philip Jacob's request
* config.pod: a few small typos (John Milton)
* databases.pod: Matt Arnold pointed out a problem with connect_on_init
  if the database server is down. I've added a warning.
* porting.pod: cleared out the confusion with the StatINC and @INC
  issue
* perl.pod: added a section that reveals the useful perldoc options
* performance.pod: added the explanation of the Apache::Leak example
  (Cliff Rayman)
* databases.pod: added the explanation of the "skipping connection
  cache during server startup", when the connection is attempted to be
  opened in the parent process. (Edmund Mergl)
* debug.pod: started the "Debugging Core Dumps" item (Doug)
* performance.pod: added a reference to Apache::RegistryBB, for those
  who want to save the little overhead of the stat() call that is
  executed under Apache::Registry. (Doug)
* modules.pod: added Apache::RegistryBB (Doug)
* porting.pod: covered the issue of the Apache/Work/Foo/Bar.pm
  collision with Apache/Work/Foo.pm if the former is loaded first
  (Doug)
* Apache::Leak is considered to be non-friendly; added a reference to
  B::LexInfo (Doug)
* porting.pod: "Passing ENV variables to CGI" -- added clarifications
  for the %ENV setting/passing mechanism in mod_perl (Doug)
* performance.pod: started a new subsection -- shared memory (what, how
  much, where)
* modules.pod and porting.pod: added the Apache::LogSTDERR module to
  solve the syslog problem (Doug)
* porting.pod: the Reloading handlers trick (Doug)

07.3.99 ver 1.14

* porting.pod: added "Exposing Apache::Registry secrets, closures,
  multiserver mode".
* A complete review, which included corrections, verifications,
  extensions and clarifications, was done to the following pods during
  the preparation of the tutorial for the 3rd apache conference:
  start.pod, intro.pod, porting.pod, performance.pod, strategy.pod,
  scenario.pod, config.pod, install.pod, control.pod, databases.pod,
  multiuser.pod, help.pod

06.19.99 ver 1.13

* While working on the presentation I discovered a wonderful 'html2ps'
  utility (http://www.tdb.uu.se/~jan/html2psug.html) -- so now we have
  a real mod_perl book in PostScript!!! (cross-references aren't
  working yet)
* hardware: added a reference to eddieware
* performance.pod: extended the Apache::Resource section (Doug)
* performance.pod: added a reference to the httperf benchmark tool.
* I made many little changes all over the guide while preparing a
  subset of the material for the upcoming apache/perl conference
  tutorial
* performance.pod: added some clarifications to the "Preload Perl
  modules at server startup" section -- regarding CGI::compile (see the
  sketch below)
* advocacy.pod: a complete rewrite to communicate the ideas
  differently. Now it displays a positive, motivational and concise
  perspective on the same ideas. (by Randy Harmon)
* strategy.pod: modifications related to memory sharing with Apache,
  mod_proxy section (Ask Bjoern Hansen)
* scenario.pod: added the missing implementation of "Standalone
  mod_perl Enabled Apache Server" + configuration, which is temporarily
  located in the same chapter.
* More pods have been purified by Steve Reppucci (performance.pod).
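A minimal sketch of the CGI::compile preloading referred to above:
CGI.pm normally compiles its methods lazily on first use, so compiling
them once in the parent lets all the children share that compiled code:

    # startup.pl
    use CGI ();
    CGI->compile(':all');   # precompile CGI.pm's autoloaded methods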
mysql_store_result" (Michael Hall, Ken Williams, Vivek Khera) * config.pod: added "Logical grouping of Location, Directory and FilesMatch directives" (Daniel Koch) * config.pod: added "The confusion with use() clause in startup" file (Mike Fletcher) * config.pod: added "The confusion with defining globals in startup" file * performance.pod: extended the Devel::DProf notes with Apache::DProf * started a new advocacy.pod: mod_perl advocacy * install.pod: "Stripping apache to make it almost perl-server" (Jeffrey W. Baker, Randal L. Schwartz,Robin Berjon) * modules.pod, config.pod : added Doug's Apache::PerlVINC to set a different @INC perl location * install.pod: covered an installation problem of: "mod_perl.c is incompatible with this version of apache" (Doug) * databases.pod: added "Opening a connection with different parameters" (Jauder Ho, Edmund Mergl) * performance.pod. modules.pod - added Apache::GzipChain to cut down download times * debug.pod: added some snippets from Doug's replies showing strace and Devel::Peek in action * updated obvious.html#Reloading_only_specific_files - some code improvements (Ken Williams) * debug.pod: added "Debugging Signal Handlers ($SIG{FOO})" which covers the latest $SIG{ALRM} changes and Doug's Sys::Signal module to overcome the handler restore problem with other signals. (Doug) * strategy.pod: mod_proxy and http accell sections were extended by notes from Joshua Chamas. * warning.pod: noted a 'rwrite returned -1' fix in CVS version * obvious.pod: added "Additional reading references" to "my() scoped variable in nested subroutines" including a pointer to an article by Mark-Jason Dominus about how Perl handles variables and namespaces, and the difference between `use vars' and `my'. * hardware.pod: applied some addition and changes. * debug.pod: added "Monitoring error_log file" * scenario.pod: added a complete definition of ProxyReceiveBufferSize, its buffering functionality. * More pods have been purified by Steve Reppucci (hardware.pod, strategy.pod). Thanks to Steve English speakers can read my scribbles as well :o) 05.17.99 ver 1.11 * new hardware.pod: added a "Operating System and Hardware Demands" (Dean Fitz reviewed it and made lots of fixes!!! Thanks) * started a new security.pod "Protecting Your Site" to explain security hazards and to show some configuration setups and code snippets. * security.pod: explained the Authentication and Authorization terms * security.pod: "Non authenticated access for internal IPs, but authenticated by external IPs" (Eric Cholet) * scenario.pod: added "HTTP Authentication with 2 servers + proxy" (Mark Mills, Russell D. Weiss) * scenario.pod: added some DSO building notes (Guy K. McArthur) * porting.pod: added "Generating correct HTTP MIME Headers" as suggested by Alex Krohn * config.pod: added "Running 'apachectl configtest' or 'httpd -t'" (Doug) * porting.pod: added "Passing ENV variables to CGI" (Doug) * Updated: "Finding the line number the error/warning has been triggered at" at porting.pod * Added the info about ProxyReceiveBufferSize in scenario.pod, mod_proxy section. (Rauznitz Balazs) * added to config.pod: Configuration Security Concerns (Gunther Birznieks) * completely rewrote the start.pod (the English was horrible :( * updated help.pod with squid help URLs 05.08.99 ver 1.10 * control.pod: SUID start-up scripts (Lincoln Stein) * porting.pod: Forking subprocesses from mod_per (Philp Gwyn) * added to performance.pod: CGI.pm\'s object methods calls vs. 
05.08.99 ver 1.10

* control.pod: SUID start-up scripts (Lincoln Stein)
* porting.pod: Forking subprocesses from mod_perl (Philip Gwyn)
* added to performance.pod: CGI.pm's object method calls vs. function
  calls
* new pod: browserbugs.pod -- workarounds for some known bugs in
  browsers.
  added: Preventing QUERY_STRING from getting corrupted because of
  &entity key names.
  added: IE 4.x does not re-post data to a non-port-80 URL
* strategy.pod: updated notes about squid (Andreas J. Koenig)
* strategy.pod and scenario.pod: started the ProxyPass sections (Mark
  Mills, Ken Williams, Ask Bjoern Hansen)
* wrote code to validate pod L<> directives, by first building a hash
  of all available anchors and a hash of all L<> directives, then
  reporting the broken links! This is cool! TomC will never accept the
  patch to his Pod2Html.pm :( So there are no broken links anymore,
  unless I forgot to run the checker :)
* start.pod now contains an overview of the guide. The previous content
  migrated to install.pod and download.pod. Part of scenario.pod moved
  to install.pod.
* people still report problems with the CSS I use; I did more tweaking
  by deleting almost all the styles. It seems people are missing some
  basic font families and complaining about being unable to read the
  text.
* strategy.pod: using thttpd instead of plain apache (Rauznitz Balazs)
* scenario.pod: was split into strategy.pod and scenario.pod.
  strategy.pod now only talks about the different approaches, while
  scenario.pod provides the building and configuration details.
  strategy.pod tries to clearly state the pros and cons of each
  approach (please review)
* introduced a new dbm.pod: mod_perl and dbm files (please review)
* introduced a new databases.pod: mod_perl and Relational Databases
  (please review)
* the whole expanded table of contents can now be found in index.html
  (index.html is now generated by a script). This should make
  navigation much easier.
* the last html source file has gone; now all src files are pods.
* improved search engine requirements: extended
  <META NAME="Description"> and <META NAME="keywords">
* improved navigation: added Next, Main, Previous links.
* added another list archive (help.html):
  http://www.geocrawler.com/lists/3/web/182/0/ (Eric Cholet)
* obvious.pod: "Setting environment variables for scripts called from
  CGI." (Lincoln Stein, Doug MacEachern)
* extended the "Using $|=1 under mod_perl and better print()
  techniques" in performance.pod. (see the sketch below)
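A minimal sketch of the better-print() idea mentioned above: rather
than globally unbuffering output with $|=1, keep buffering and flush
explicitly where the client should see partial output (mod_perl 1.x
API; the chunks are made up):

    $r->print($first_chunk);   # buffered
    $r->rflush;                # push what we have to the client now
    $r->print($rest);          # buffered until the handler finishes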
* added to config.pod: started the mod_perl as DSO section * updated config.pod: added how the $Apache::Registry::NameWithVirtualHost bug in older versions can be turned into a feature (Doug) * added to performance.pod: Memory sharing (Leslie Mikesell) * updated warning.pod: server reached MaxClients setting * updated performance.pod: MaxClients reached (Nick Tonkin) * updated start.pod: "How can I tell whether mod_perl is really installed" - added httpd -l * modified scenario.pod: made little changes to make the installation process less confusing (Pete Harlan) * obvious.pod: updated "Handling the server timeout cases" - $SIG{ALRM} does not restore the original underlying C handler; pointed to Sys::Signal as a remedy (Doug) * new in multiuser.pod: ISPs providing mod_perl services - a fantasy or reality? (Notes from Mark Mills, Russell D. Weiss) * new in multiuser.pod: Virtual Hosts in the guide * new pod: multiuser.pod - mod_perl for ISPs. mod_perl and Virtual Hosts. * Added a link to the new book at the O'Reilly and Amazon.com sites. * debug.pod: added Apache::DB coverage * performance.pod: "Why you should not use $|=1 under mod_perl" (Doug, Randal) * debug.pod: "gdb says there are no debugging symbols" (Michael Hall) * config.pod: "the server no longer retrieves the DirectoryIndex files for a directory" (Andreas Grupp) * scenario.pod: added 'make test fails' when people use PREP_HTTPD=1 or don't use DO_HTTPD=1 (Doug) * removed the 'Mini' part from the guide's name, since it has grown enough not to be called mini any more. * modules.pod: added Apache::Request * modules.pod: added Apache::DBI * modules.pod: added Apache::Session (Jeffrey Baker) * new pod: modules.pod - to introduce Apache::* modules with small examples, to raise curiosity to read the whole man page * new in scenario.pod: "mod_perl and proxy server" Incentives Squid proxy server in httpd accelerator mode Running a squid and 2 webservers scenario Running a squid and 1 mod_perl apache server scenario (Reviewed and modified according to notes by Richard Dice, Andreas J. Koenig, Eric Cholet, Jeremy Bailin, David Landgren) * Added to scenario.pod: 'Publishing port numbers different from 80' (originally by Ken Williams, forwarded by Eric Strovink) * config.pod: new section "Configuring Apache + mod_perl with mod_macro" contributed entirely by Eric Cholet (I have edited it a bit :). 04.03.99 * Rewrote the CREDITS section of intro.html. I hope I didn't miss anyone; if I did, please tell. Let's feed the ego :) * The guide now looks much better with StyleSheets (Nathan Vonnahme) * added to porting.pod: Filehandle and lock leakages (Ken Williams, Doug) * added to obvious.pod: Handling the server timeout cases (Doug) * created new pod: perl.pod to cover some too frequently asked pure perl questions: opened up with "Using global variables and sharing them between modules/packages" * Now the pod sources are available online along with the resulting htmls and the scripts that generate them. * Added a summary of various mod_perl deployment schemas (1/1, 2/2, DSO and proxy). /scenario.html#More_mod_perl_deploying_schemas (Mark Mills) * created new frequent.pod for "Frequent mod_perl problems" as suggested by Eric Cholet, who said that problems like 'my() scoped variable in nested subroutines' come up so often on the list that they should be stressed in the guide as one of the most important things to read/beware of. Since for now it has only a few problems, please suggest what other ones should go here.
* obvious.pod rewritten: my() scoped variable in nested subroutines (Eric Cholet) * some typo fixes in intro.html, start.pod and scenario.pod (Garr Updegraff) * snippets.pod: Cookie handling code (Ed Park) * obvious.pod updated: Handling the 'User pressed Stop button' case. More hints (Eric Strovink) and apache 1.3.6 news (Henrique Pantarotto) * scenario.pod added: Is it possible to determine which options were given to modperl's Makefile.PL * More pods have been purified by Steve Reppucci (warning.pod, obvious.pod and porting.pod). He did so much work to make them readable that I'm afraid to apply new changes and break all the beauty he made :) Thanks, Steve! 03.15.99 * Added a downloadable guide.tar.gz as someone requested * snippets.pod: Accessing variables from the caller's package (Ken Williams) * porting.pod: Redirecting mod_perl error_log messages to the browser - added an extensive example * control.pod: added hints - Preventing the mod_perl process from eating up all the disk space when it goes wild. (Andreas J. Koenig, Ulrich Pfeifer) * performance.pod: cleared out where one can get the 'ab' Apache Benchmark utility * warning.pod: covered - Evil things might happen when using PerlFreshRestart (Doug) * status.pod: covered - Compiled Registry Scripts section seems to be empty (Radu Greab) * warning.pod: covered - RegistryLoader: Cannot translate the URI... * scenario.pod: added a note: when using USE_APACI and APACHE_PREFIX, make install will also run the make install at Apache's source tree... (Doug) * debug.pod: Getting some decent debug info when running under mod_perl (Doug) * ScriptAlias vs. Alias updated and explained in config.pod. (Doug, Ask and Eric) * scenario.pod, intro.html, config.pod, control.pod and start.pod were purified by Steve Reppucci. Steve fixed my incorrect English expressions and tenses, and corrected some technical details! Enormous help, Steve! Thanks! If you see some incorrect English in the guide, don't hesitate to send me an email. Thanks! 01.22.99 * new obvious.pod: Where do the warnings/errors go? * new index.html: added a search box * new snippets.pod: added the error_log.pl script to fetch the latest logs from the server without telnetting there * new snippets.pod: How to avoid printing the header more than once. * new snippets.pod: More on relative paths * upd start.pod: removed all 'latest version is' notes, so the guide will not misguide people (Ken Williams) * upd config.html: removed redundant ;; (Ken Williams) * upd config.html: fixed the question/answer 'Is there a way to provide a different startup.pl file for each individual virtual' (Ken Williams) * upd help.html: a few links fixed (Peter Skov (UNIT)) * upd porting.pod: CORE::exit vs Apache::exit section update (Doug) * upd scenario.pod: note about the importance of running make clean, because of possible binary incompatibility (1.3.3 vs 1.3.4) (Doug) * upd porting.pod: switches -w, -T in the shebang line (Doug) * upd debug.pod: tracing the PerlRequire's and PerlModule's as they are loaded (Doug) * add config.pod: Sometimes a script from one virtual host calls a script with the same path from the second virtual host (Doug) * add performance.pod: how can I find out if my modperl scripts have memory leaks (and where). (Doug) * help.html: added a section for DBI help (Jeffrey W. Baker) 12.28.98 * Updated the "Client hit STOP or Netscrape bit it" section, with the new warning "[modperl] caught SIGPIPE in process" for ver 1.17 (new Apache::SIG) * Richard A.
Soderberg spotted a few problems with name anchors in start.html (the pod converter doesn't resolve the problem correctly) and ScriptAlias typos in config.html. * broken link to www-security-faq was spotted by Gunther Birznieks 12.20.98 * fixed: @INC vs %INC obvious.html#Using_Apache_StatINC (thanks to Ken Williams) * new: Apache::SpeedLimit added to performance.html#Limiting_the_request_rate_speed_ * modified: register_cleanup in Registry scripts (END{} blocks) based on last week's thread * new: obvious.html#Handling_the_User_pressed_Stop_ * Found a bug in Pod::Html - it tries to convert HTTP::Foo-like tokens into hypertext links, which breaks the code in the resulting html. Applied the patch to Pod::Html::VERSION 1.01 1119a1120 > (?! :) # don't convert HTTP::Foo and alike * Discovered that the guide is searchable via http://www.apache.org/search.html and added a link to index.html * Extended the control|Log_Rotation section (+Script from Randal) * control: HUP vs TERM vs USR1. I have asked for validation of this section, but received none... Added a note about the slowness of termination (Robin Berjon) and a possible way to speed it up (Frank D. Cringle). Added a mnemonics => numbers mapping for SIGs (Marshall Dudley) * added the missing USE_APACI=1 in start.html#Mod_Perl (Thanks to Tzvetan Stoyanov) 12.13.98 * covered warning: rwrite returned -1 * covered warning: Client hit STOP or Netscrape bit it! * covered warning: Can't load '.../auto/DBI/DBI.so' for module DBI * covered porting: using format() * covered warning: child process 30388 did not exit, sending another SIGHUP * extended warning: Callback called exit All the above are based on Doug's answers this weekend :) * new: config: Tuning MinSpareServers MaxSpareServers StartServers MaxClients MaxRequestsPerChild (actually a pointer to the next item) * new: performance: Tuning Apache's configuration variables for the best performance Tuning with ab - ApacheBench Tuning with the crashme script Choosing MaxClients Choosing MaxRequestsPerChild Choosing MinSpareServers, MaxSpareServers and StartServers Summary of Benchmarking to tune all 5 parameters ########################################################################## 12.08.98 * Lots of "little typos" fixed. Thanks to Evan A. Zacks, Eric Cholet and Nancy Lin! * added a quote from the DBI page on why $sth->rows can't be used for row counting. * fixed the obvious.html#Compiled_Regular_Expressions href in porting.html. Thanks to Richard Dice! * lots of little changes and add-ons... 12.07.98 * Ran a spell check. ispell and WWWebster were quite helpful :) * Added Richard Dice's notes about ways to see whether or not mod_perl is actually compiled into the server and working: "check the error_log file" (installation) * Added an 'Is it possible to install mod_perl without root access?' section to the Server Installation (scenario) page. * Added Perrin Harkins and Jonathan Peterson's notes about apache/mod_perl/embperl/DBI vs IIS/ASP/ADO * Added a CHANGES file (this one) * Added an 'all in one page' version, suitable for printing. Currently it's just an ordered cat(). In the future it might change :) 12.03.98 * First Release 1.1 modperl-docs/src/docs/1.0/guide/advocacy.pod Index: advocacy.pod =================================================================== =head1 NAME mod_perl Advocacy =head1 Thoughts about scalability and flexibility Your need for scalability and flexibility depends on what you need from your web site.
If you only want a simple guest book or database gateway with no feature headroom, you can get away with any EASY_AND_FAST_TO_DEVELOP_TOOL (Exchange, MS IIS, Lotus Notes, etc). Experience shows that you will soon want more functionality, at which point you'll discover the limitations of these "easy" tools. Gradually, your boss will ask for increasing functionality and at some point you'll realize that the tool lacks flexibility and/or scalability. Then your boss will either buy another EASY_AND_FAST_TO_DEVELOP_TOOL and repeat the process (with different unforeseen problems), or you'll start investing time in learning how to use a powerful, flexible tool to make the long-term development cycle easier. If you and your company are serious about delivering flexible Internet functionality, do your homework. Then urge your boss to invest a little extra time and resources in choosing the right tool for the job. The extra quality and manageability of your site, along with your ability to deliver new and improved functionality of high quality and in good time, will prove the superiority of using solid, flexible tools. =head1 The boss, the developer and advocacy Each developer has a boss who participates in the decision-making process. Remember that the boss considers input from sales people, developers, the media and associates before handing down large decisions. Of course, results count! A sales brochure makes very little impact compared to a working demonstration, and demonstrations of company-specific and developer-specific results count for a lot! Personally, when I discovered mod_perl I did a lot of testing and coding at home and at work. Once I had a working heavy application, I came to my boss with two URLs - one for the plain CGI server and the other for the mod_perl-enabled server. It took about 30 seconds for my boss to say: `Go with it'. Of course since then I have had to provide all the support for other developers, which is why I took the time to learn it in the first place (and why this guide was created!). Chances are that if you've done your homework, learnt the tools and can deliver results, you'll have a successful project. If you convince your boss to try a tool that you don't know very well, your results may suffer. If your boss follows your development process closely and sees that your progress is much worse than expected, you might be told to "forget it" and mod_perl might not get a second chance. Advocacy is a great thing for the open-source software movement, but it's best done quietly until you have confidence that you can show productivity. If you can demonstrate to your boss a heavy CGI script which runs much faster under mod_perl, that may be a strong argument for further evaluation. Your company may even sponsor a portion of your learning process. Learn the technology by working on sample projects. Learn how to support yourself and learn how to get support from the community; then advocate your ideas to your boss. Then you'll have the knowledge; your company will have the benefit; and mod_perl will have the reputation it deserves. =head1 A summary of the perl/CGI discussion at slashdot.org Well, there was a nice discussion of the merits of Perl in the CGI world. I took the time to summarize this thread, so here is what I've got: Perl Domination in CGI Programming? http://slashdot.org/askslashdot/99/10/20/1246241.shtml =over 4 =item * Perl is cool and fun to code with. =item * Perl is very fast to develop with. =item * Perl is even faster to develop with if you know what CPAN is.
:) =item * Math-intensive code and other stuff which is faster in C/C++ can be plugged into Perl with XS/SWIG and may be used transparently by Perl programmers. =item * Most CGI applications do text processing, at which Perl excels. =item * Forking and loading (unless the code is shared) of C/C++ CGI programs produces an overhead. =item * Except for Intranets, bandwidth is usually a bigger bottleneck than Perl performance, although this might change in the future. =item * For database driven applications, the database itself is a bottleneck. Lots of posts talk about latency vs throughput. =item * mod_perl, FastCGI, Velocigen and PerlEx all give good performance gains over plain mod_cgi. =item * Other light alternatives to Perl and its derivatives which have been mentioned: PHP, Python. =item * There were almost no voices from users of M$ and similar technologies; I guess that's because they don't read http://slashdot.org :) =item * Many said that in many people's minds: 'CGI' eq 'Perl' =back =cut 1.1 modperl-docs/src/docs/1.0/guide/browserbugs.pod Index: browserbugs.pod =================================================================== =head1 NAME Workarounds for some known bugs in browsers. =head1 Preventing QUERY_STRING from getting corrupted because of &entity key names In a URL which contains a query string, if the string has multiple parts separated by ampersands and it contains a key named "reg", for example C<http://my.site.com/foo.pl?foo=bar&reg=foobar>, then some browsers will interpret C<&reg> as an SGML entity and encode it as C<&reg;>. This will result in a corrupted C<QUERY_STRING>. If you encounter this problem, then either you should avoid using such keys or you should separate parameter pairs with C<;> instead of C<&>. C<CGI.pm>, C<Apache::Request> and C<$r-E<gt>args()> support a semicolon instead of an ampersand as a separator. So your URI should look like this: C<http://my.site.com/foo.pl?foo=bar;reg=foobar>. Note that this is only an issue when you are building your own URLs with query strings. It is not a problem when the URL is the result of submitting a form because the browsers I<have> to get that right. =head1 IE 4.x does not re-post data to a non-port-80 URL One problem with publishing 8080 port numbers (or so I have been told) is that IE 4.x has a bug when re-posting data to a non-port-80 URL. It drops the port designator and uses port 80 anyway. See L<Publishing Port Numbers other than 80|config/Publishing_Port_Numbers_other_th>. =cut 1.1 modperl-docs/src/docs/1.0/guide/config.cfg Index: config.cfg =================================================================== use vars qw(@c); @c = ( id => 'guide', title => "mod_perl guide", abstract => 'Deploying mod_perl technology to give rocket speed to your CGI/Perl scripts. ', chapters => [ qw( intro.pod start.pod perl.pod install.pod config.pod control.pod strategy.pod scenario.pod porting.pod performance.pod frequent.pod troubleshooting.pod correct_headers.pod security.pod databases.pod dbm.pod multiuser.pod debug.pod browserbugs.pod modules.pod snippets.pod hardware.pod advocacy.pod help.pod download.pod ), ], ); 1.1 modperl-docs/src/docs/1.0/guide/config.pod Index: config.pod =================================================================== =head1 NAME mod_perl Configuration =head1 Server Configuration The next step after building and installing your new mod_perl-enabled Apache server is to configure the server. There are two separate parts to configure: Apache and mod_perl. Each has its own set of directives.
To configure your mod_perl-enabled Apache server, the only file that you should need to edit is I<httpd.conf>. By default, I<httpd.conf> is put into the I<conf> directory under the server root directory. The default server root is I</usr/local/apache/> on many UNIX platforms, but within reason it can be any directory you choose. If you are new to Apache and mod_perl, you will probably find it helpful to keep to the directory layouts we use in this Guide if you can. Apache versions 1.3.4 and later are distributed with the configuration directives in a single file -- I<httpd.conf>. This Guide uses the same approach in its examples. Prior to version 1.3.4, the default Apache installation used three configuration files -- I<httpd.conf>, I<srm.conf>, and I<access.conf>. If you wish, you can still use all three files, by setting the AccessConfig and ResourceConfig directives in I<httpd.conf>. You will also see later on that we use other files, for example I<perl.conf> and I<startup.pl>. This is just for our convenience; you could still do everything in I<httpd.conf> if you wished. =head1 Apache Configuration Apache configuration can be confusing. To minimize the number of things that can go wrong, it can be a good idea first to configure Apache itself without mod_perl. This will give you the confidence that it works, and perhaps some idea of how to configure it. The I<httpd.conf> distributed with Apache warns against simply editing the file and running the server without understanding all the implications. Here is another warning: modifying the configuration file and adding new directives can introduce security problems and can have performance implications. The Apache distribution comes with an extensive configuration manual, and in addition each section of the distributed configuration file includes helpful comments explaining how every directive should be configured and what the default values are. If you haven't moved Apache's directories around, the installation program will have configured everything for you. You can just start the server and test it. To start the server use the C<apachectl> utility which comes bundled with the Apache distribution. It resides in the same directory as C<httpd>, the Apache server itself. Execute: /usr/local/apache/bin/apachectl start Now you can test the server, for example by accessing http://localhost from a browser running on the same host. =head2 Configuration Directives For a basic setup there are just a few things to configure. If you have moved any directories you have to update them in I<httpd.conf>. There are many of them; here are just a couple of examples: ServerRoot "/usr/local/apache" DocumentRoot "/home/httpd/docs" If you want to run it on a port other than port 80, edit the C<Port> directive: Port 8080 You might want to change the user and group names the server will run under. Note that if started as the I<root> user (which is generally the case), the parent process will continue to run as I<root>, but its children will run as the user and group you have specified. For example: User httpd Group httpd There are many other directives that you might need to configure as well. In addition to directives which take a single value there are whole sections of the configuration (such as the C<E<lt>DirectoryE<gt>> and C<E<lt>LocationE<gt>> sections) which apply only to certain areas of your Web space. As mentioned earlier you will find them all in I<httpd.conf>.
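Since a typo in any of these directives can prevent the server from starting, it is worth validating the configuration after every change. A quick check, using the stock Apache tools (the paths assume the default layout used throughout this Guide), looks like this:

  % /usr/local/apache/bin/apachectl configtest

or, equivalently:

  % /usr/local/apache/bin/httpd -t

Both commands parse I<httpd.conf> and report any syntax errors without actually starting the server.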
=head2 .htaccess files If there is a file with the name I<.htaccess> in any directory, Apache scans it for further configuration directives which it then applies only to that directory (and its subdirectories). The name I<.htaccess> is confusing, because the file can contain any configuration directives, not just those related to access to resources. You will not be surprised to find that a configuration directive can change the names of the files used in this way. Note that if there is a <Directory /> AllowOverride None </Directory> section in I<httpd.conf>, Apache will not try to look for I<.htaccess> at all. =head2 E<lt>DirectoryE<gt>, E<lt>LocationE<gt> and E<lt>FilesE<gt> Sections I'll explain just the basics of the C<E<lt>DirectoryE<gt>>, C<E<lt>LocationE<gt>> and C<E<lt>FilesE<gt>> sections. Remember that there is more to know and the rest of the information is available in the Apache documentation. The information I'll present here is just what is important for understanding the mod_perl configuration sections. Apache considers directories and files on your machine all to be resources. For each resource you can determine a particular behaviour which will apply to every request for information from that particular resource. Obviously the directives in C<E<lt>DirectoryE<gt>> sections apply to specific directories on your host machine, and those in C<E<lt>FilesE<gt>> sections apply only to specific files (actually groups of files with names which have something in common). In addition to these sections, Apache has the concept of a C<E<lt>LocationE<gt>>, which is also just a resource. C<E<lt>LocationE<gt>> sections apply to specific URIs. Locations are based at the document root; directories are based at the filesystem root. For example, if you have the default server directory layout where the server root is I</usr/local/apache> and the document root is I</usr/local/apache/htdocs> then static files in the directory I</usr/local/apache/htdocs/pub> are in the location I</pub>. It is up to you to decide which directories on your host machine are mapped to which locations. You should be careful how you do it, because the security of your server may be at stake. Locations do not necessarily have to refer to existing physical directories, but may refer to virtual resources which the server creates for the duration of a single browser request. As you will see, this is often the case for a mod_perl server. When a browser asks for a resource from your server, Apache determines from its configuration whether or not to serve the request, whether to pass the request to another server, what (if any) authorization is required for access to the resource, and how to reply. For any given resource, the various sections in your configuration may provide conflicting information. For example, you may have a C<E<lt>DirectoryE<gt>> section which tells Apache that authorization is required for access to the resource but you may have a C<E<lt>FilesE<gt>> section which says that it is not. It is not always obvious which directive takes precedence in these cases. This can be a trap for the unwary. =over =item * E<lt>Directory directoryPathE<gt> ... E<lt>/DirectoryE<gt> Can appear in server and virtual host configurations. C<E<lt>DirectoryE<gt>> and C<E<lt>/DirectoryE<gt>> are used to enclose a group of directives which will apply only to the named directory and sub-directories of that directory. Any directive which is allowed in a directory context (see the Apache documentation) may be used.
The path given in the C<E<lt>DirectoryE<gt>> directive is either the full path to a directory, or a wild-card string. In a wild-card string, C<?> matches any single character, C<*> matches any sequence of characters, and C<[]> matches character ranges. (This is similar to the shell's file globs.) None of the wildcards will match a C</> character. For example: <Directory /home/httpd/docs> Options Indexes </Directory> If you want to use a regular expression to match then you should use the syntax C<E<lt>DirectoryMatch regexE<gt>> ... C<E<lt>/DirectoryMatchE<gt>>. If multiple (non-regular expression) directory sections match the directory (or its parents) containing a document, then the directives are applied in the order of shortest match first, interspersed with the directives from any I<.htaccess> files. For example, with <Directory /> AllowOverride None </Directory> <Directory /home/httpd/docs/*> AllowOverride FileInfo </Directory> for access to the document I</home/httpd/docs/index.html> the steps are: =over =item * Apply directive C<AllowOverride None> (disabling I<.htaccess> files). =item * Apply directive C<AllowOverride FileInfo> for directory I</home/httpd/docs/> (which now enables I<.htaccess> in I</home/httpd/docs/> and its sub-directories). =item * Apply any C<FileInfo> directives in I</home/httpd/docs/.htaccess>. =back =item * C<E<lt>Files filenameE<gt>> ... C<E<lt>/FilesE<gt>> Can appear in server and virtual host configurations, and in I<.htaccess> files as well. The C<E<lt>FilesE<gt>> directive provides for access control by filename. It is comparable to the C<E<lt>DirectoryE<gt>> and C<E<lt>LocationE<gt>> directives. It should be closed with the C<E<lt>/FilesE<gt>> directive. The directives given within this section will be applied to any object with a basename (last component of filename) matching the specified filename. C<E<lt>FilesE<gt>> sections are processed in the order they appear in the configuration file, after the C<E<lt>DirectoryE<gt>> sections and I<.htaccess> files are read, but before C<E<lt>LocationE<gt>> sections. Note that C<E<lt>FilesE<gt>> can be nested inside C<E<lt>DirectoryE<gt>> sections to restrict the portion of the filesystem they apply to. C<E<lt>FilesE<gt>> cannot be nested inside C<E<lt>LocationE<gt>> sections, however. The filename argument should include a filename, or a wild-card string, where C<?> matches any single character, and C<*> matches any sequence of characters. Extended regular expressions can also be used: simply place a tilde character C<~> between the directive and the regular expression. The regular expression should be in quotes. The dollar symbol refers to the end of the string. The pipe character indicates alternatives. Special characters in extended regular expressions must be escaped with a backslash. For example: <Files ~ "\.(gif|jpe?g|png)$"> would match most common Internet graphics formats. Alternatively you can use the C<E<lt>FilesMatch regexE<gt>> ... C<E<lt>/FilesMatchE<gt>> syntax. =item * E<lt>Location URLE<gt> ... E<lt>/LocationE<gt> Can appear in server and virtual host configurations. The C<E<lt>LocationE<gt>> directive provides for access control by URL. It is similar to the C<E<lt>DirectoryE<gt>> directive, and starts a section which is terminated with the C<E<lt>/LocationE<gt>> directive. C<E<lt>LocationE<gt>> sections are processed in the order they appear in the configuration file, after the C<E<lt>DirectoryE<gt>> sections, I<.htaccess> files and C<E<lt>FilesE<gt>> sections are read.
The C<E<lt>LocationE<gt>> section is the directive that is used most often with mod_perl. URLs I<do not> have to refer to real directories or files within the filesystem at all; C<E<lt>LocationE<gt>> operates completely outside the filesystem. Indeed it may sometimes be wise to ensure that C<E<lt>LocationE<gt>>s do not match real paths, to avoid confusion. The URL may use wildcards. In a wild-card string, C<?> matches any single character, C<*> matches any sequence of characters, and C<[]> groups characters to match. For regular expression matches use the C<E<lt>LocationMatch regexE<gt>> ... C<E<lt>/LocationMatchE<gt>> syntax. The C<E<lt>LocationE<gt>> functionality is especially useful when combined with the C<SetHandler> directive. For example, to enable status requests but allow them only from browsers at I<example.com>, you might use: <Location /status> SetHandler server-status order deny,allow deny from all allow from .example.com </Location> =back =head2 How Directory, Location and Files Sections are Merged When configuring the server, it's important to understand the order in which the rules of each section apply to requests. The order of merging is: =over =item 1 C<E<lt>DirectoryE<gt>> (except regular expressions) and I<.htaccess> are processed simultaneously, with I<.htaccess> overriding C<E<lt>DirectoryE<gt>> =item 1 C<E<lt>DirectoryMatchE<gt>>, and C<E<lt>DirectoryE<gt>> with regular expressions =item 1 C<E<lt>FilesE<gt>> and C<E<lt>FilesMatchE<gt>> are processed simultaneously =item 1 C<E<lt>LocationE<gt>> and C<E<lt>LocationMatchE<gt>> are processed simultaneously =back Apart from C<E<lt>DirectoryE<gt>>, each group is processed in the order that it appears in the configuration files. C<E<lt>DirectoryE<gt>> (group 1 above) is processed in the order shortest directory component to longest. If multiple C<E<lt>DirectoryE<gt>> sections apply to the same directory then they are processed in the configuration file order. Sections inside C<E<lt>VirtualHostE<gt>> sections are applied as if you were running several independent servers. The directives inside C<E<lt>VirtualHostE<gt>> sections do not interact with each other. They are applied after first processing any sections outside the virtual host definition. This allows virtual host configurations to override the main server configuration. Later sections override earlier ones. =head2 Sub-Grouping of E<lt>LocationE<gt>, E<lt>DirectoryE<gt> and E<lt>FilesE<gt> Sections Let's say that you want all files, except for a few of the files in a specific directory and below, to be handled in the same way. For example, you might want all the files in I</home/httpd/docs> to be served as plain files, but any files ending in I<.html> or I<.txt> to be processed by the content handler of your C<Apache::MyFilter> module: <Directory /home/httpd/docs> <FilesMatch "\.(html|txt)$"> SetHandler perl-script PerlHandler Apache::MyFilter </FilesMatch> </Directory> Thus it is possible to embed sections inside sections to create subgroups which have their own distinct behavior. Alternatively you could use a C<E<lt>FilesE<gt>> section inside an I<.htaccess> file. Note that you can't put C<E<lt>FilesE<gt>> or C<E<lt>FilesMatchE<gt>> sections inside a C<E<lt>LocationE<gt>> section, but you can put them inside a C<E<lt>DirectoryE<gt>> section. =head2 Options Directive Normally, if multiple C<Options> directives apply to a directory, then the most specific one is applied in full; the options are not merged.
However if all the options on the C<Options> directive are preceded by a C<+> or C<-> symbol, the options are merged. Any options preceded by C<+> are added to the options currently in force, and any options preceded by C<-> are removed. For example, without any C<+> and C<-> symbols: <Directory /home/httpd/docs> Options Indexes FollowSymLinks </Directory> <Directory /home/httpd/docs/shtml> Options Includes </Directory> then only C<Includes> will be set for the I</home/httpd/docs/shtml> directory. However if the second C<Options> directive uses the C<+> and C<-> symbols: <Directory /home/httpd/docs> Options Indexes FollowSymLinks </Directory> <Directory /home/httpd/docs/shtml> Options +Includes -Indexes </Directory> then the options C<FollowSymLinks> and C<Includes> are set for the I</home/httpd/docs/shtml> directory. =head1 mod_perl Configuration When you have tested that the Apache server works on your machine, it's time to configure mod_perl. Some of the configuration directives are already familiar to you, but mod_perl introduces a few new ones. It can be a good idea to keep all the mod_perl related configuration at the end of the configuration file, after the native Apache configuration directives. To ease maintenance and to simplify multiple server installations, the Apache/mod_perl configuration system gives you several alternative ways to keep your configuration directives in separate places. The C<Include> directive in I<httpd.conf> allows you to include the contents of other files, just as if the information were all contained in I<httpd.conf>. This is a feature of Apache itself. For example if you want all your mod_perl configuration to be placed in a separate file I<mod_perl.conf> you can do that by adding this directive to I<httpd.conf>: Include conf/mod_perl.conf mod_perl adds two further mechanisms: C<E<lt>PerlE<gt>> sections allow you to execute Perl code from within any configuration file at server startup time, and as you will see later, a file containing any Perl program can be executed (also at server startup time) simply by mentioning its name in a C<PerlRequire> or C<PerlModule> directive. =head2 Alias Configurations The C<ScriptAlias> and C<Alias> directives provide a mapping of a URI to a file system directory. The directive: Alias /foo /home/httpd/foo will map all requests starting with I</foo> onto the files starting with I</home/httpd/foo/>. So when Apache gets a request http://www.example.com/foo/test.pl the server will map this into the file I<test.pl> in the directory I</home/httpd/foo/>. In addition, C<ScriptAlias> causes all the requests that match the URI (e.g. I</cgi-bin>) to be executed under mod_cgi. ScriptAlias /cgi-bin /home/httpd/cgi-bin is actually the same as: Alias /cgi-bin /home/httpd/cgi-bin <Location /cgi-bin> SetHandler cgi-script </Location> where the latter directive invokes mod_cgi. You shouldn't use the C<ScriptAlias> directive unless you want the request to be processed under mod_cgi. Therefore, when you configure mod_perl sections, use C<Alias> instead. Under mod_perl the C<Alias> directive will be followed by two further directives. The first is the S<SetHandler perl-script> directive, which tells Apache to invoke mod_perl to run the script. The second directive (for example C<PerlHandler>) tells mod_perl which handler (Perl module) the script should be run under, and hence for which phase of the request. Refer to the section L<Perl*Handlers|config/Perl_Handlers> for more information about handlers for the various request phases.
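Putting these pieces together, a minimal mod_perl mapping might look like the following sketch (the paths, and the choice of C<Apache::Registry> as the handler, are just examples; the full treatment follows below):

  Alias /perl/ /home/httpd/perl/
  <Location /perl>
      SetHandler perl-script
      PerlHandler Apache::Registry
  </Location>

Here C<Alias> provides the URI to filepath translation, C<SetHandler> hands the request to mod_perl, and C<PerlHandler> names the Perl module that will generate the content.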
When you have decided which methods to use to run your scripts and where you will keep them, you can add the configuration directive(s) to I<httpd.conf>. They will look like those below, but they will of course reflect the locations of your scripts in your file-system and the decisions you have made about how to run the scripts: ScriptAlias /cgi-bin/ /home/httpd/cgi-bin/ Alias /perl/ /home/httpd/perl/ In the examples above all the requests issued for URIs starting with I</cgi-bin> will be served from the directory I</home/httpd/cgi-bin/>, and starting with I</perl> from the directory I</home/httpd/perl/>. =head3 Running CGI, PerlRun, and Registry Scripts Located in the Same Directory # Typical for plain cgi scripts: ScriptAlias /cgi-bin/ /home/httpd/perl/ # Typical for Apache::Registry scripts: Alias /perl/ /home/httpd/perl/ # Typical for Apache::PerlRun scripts: Alias /cgi-perl/ /home/httpd/perl/ In the examples above we have mapped the three different URIs (I<http://www.example.com/perl/test.pl> I<http://www.example.com/cgi-bin/test.pl> and I<http://www.example.com/cgi-perl/test.pl>) all to the same file I</home/httpd/perl/test.pl>. This means that we can have all our CGI scripts located at the same place in the file-system, and call the script in any of three ways simply by changing one component of the URI (I<cgi-bin|perl|cgi-perl>). This technique makes it easy to migrate your scripts to mod_perl. If your script does not seem to be working while running under mod_perl, then in most cases you can easily call the script in straight mod_cgi mode or under C<Apache::PerlRun> without making any script changes. Simply change the URL you use to invoke it. Although in the configuration above we have configured all three I<Aliases> to point to the same directory within our file system, you can of course have them point to different directories if you prefer. You should remember that it is undesirable to run scripts in plain mod_cgi mode from a mod_perl-enabled server--the resource consumption is too high. It is better to run these on a plain Apache server. See L<Standalone mod_perl Enabled Apache Server|strategy/Standalone_mod_perl_Enabled_Apac>. =head2 E<lt>LocationE<gt> Configuration The C<E<lt>LocationE<gt>> section assigns a number of rules which the server should follow when the request's URI matches the I<Location>. Just as it is the widely accepted convention to use I</cgi-bin> for your mod_cgi scripts, it is conventional to use I</perl> as the base URI of the perl scripts which you are running under mod_perl. Let's review the following very widely used C<E<lt>LocationE<gt>> section: Alias /perl/ /home/httpd/perl/ PerlModule Apache::Registry <Location /perl> SetHandler perl-script PerlHandler Apache::Registry Options ExecCGI allow from all PerlSendHeader On </Location> This configuration causes all requests for URIs starting with I</perl> to be handled by the mod_perl Apache module with the handler from the C<Apache::Registry> Perl module. Let's review the directives inside the C<E<lt>LocationE<gt>> section in the example: <Location /perl> Remember the C<Alias> from the above section? We use the same C<Alias> here; if you were to use a C<E<lt>LocationE<gt>> that does not have the same C<Alias>, the server would fail to locate the script in the file system. You need the C<Alias> setting only if the code that should be executed is located in the file. So C<Alias> just provides the URI to filepath translation rule. Sometimes there is no script to be executed. 
Instead there is some module whose method is being executed, similar to I</perl-status>, where the code is stored in an Apache module. In such cases we don't need C<Alias> settings for those C<E<lt>LocationE<gt>>s. SetHandler perl-script This assigns the mod_perl Apache module to handle the content generation phase. PerlHandler Apache::Registry Here we tell Apache to use the C<Apache::Registry> Perl module for the actual content generation. Options ExecCGI The C<Options> directive accepts various parameters (options), one of which is C<ExecCGI>. This tells the server that the file is a program and should be executed, instead of just being displayed like a static file (like an HTML file). If you omit this option then the script will either be rendered as plain text or else it will trigger a I<Save-As> dialog, depending on the client's configuration. allow from all This directive is used to set access control based on domain. The above settings allow clients from any domain to run the script. PerlSendHeader On C<PerlSendHeader On> tells the server to send the HTTP headers to the browser on every script invocation. You will want to turn this off for nph (non-parsed-headers) scripts. The C<PerlSendHeader On> setting invokes Apache's C<ap_send_http_header()> function after parsing the headers generated by the script. It is only meant for emulation of mod_cgi behavior with regard to headers. To send the HTTP headers it's always better either to use the C<$r-E<gt>send_http_header> method using the Apache Perl API or to use the C<$q-E<gt>header> method from the C<CGI.pm> module. </Location> Closes the C<E<lt>LocationE<gt>> section definition. Note that sometimes you will have to preload the module before using it in the C<E<lt>LocationE<gt>> section. In the case of C<Apache::Registry> the configuration will look like this: PerlModule Apache::Registry <Location /perl> SetHandler perl-script PerlHandler Apache::Registry Options ExecCGI allow from all PerlSendHeader On </Location> C<PerlModule> is equivalent to Perl's native C<use()> function call. No changes are required to the I</cgi-bin> location (mod_cgi), since it has nothing to do with mod_perl. Here is another very similar example, this time using C<Apache::PerlRun> (for more information see L<Apache::PerlRun|porting/Apache__PerlRun__a_closer_look>): <Location /cgi-perl> SetHandler perl-script PerlHandler Apache::PerlRun Options ExecCGI allow from all PerlSendHeader On </Location> The only difference from the C<Apache::Registry> configuration is the argument of the C<PerlHandler> directive, where C<Apache::Registry> has been replaced with C<Apache::PerlRun>. =head2 Overriding E<lt>LocationE<gt> Setting in "Sub-Location" Suppose you have: <Location /foo> SetHandler perl-script PerlHandler My::Module </Location> If you want to remove a mod_perl handler setting from a location beneath the location where the handler was set (e.g. I</foo/bar>), all you have to do is reset the handler, like this: <Location /foo/bar> SetHandler default-handler </Location> Now all the requests starting with I</foo/bar> will be served by Apache's default handler. =head2 PerlModule and PerlRequire Directives As we saw earlier, a module should be loaded before it is used. C<PerlModule> and C<PerlRequire> are the two mod_perl directives which are used to load modules and code. They are almost equivalent to Perl's C<use()> and C<require()> functions respectively, and are called from the Apache configuration file.
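To make the analogy concrete, here is a rough equivalence (a sketch only; as just noted, the directives are only almost equivalent to the Perl built-ins):

  # These httpd.conf directives:
  #
  #   PerlModule  Apache::DBI
  #   PerlRequire /home/httpd/perl/lib/startup.pl
  #
  # behave roughly like this Perl code, run inside the server at startup:
  use Apache::DBI ();
  require "/home/httpd/perl/lib/startup.pl";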
You can pass one or more module names as arguments to C<PerlModule>: PerlModule Apache::DBI CGI DBD::Mysql Generally the modules are preloaded from the startup script, which is usually called I<startup.pl>. This is a file containing plain Perl code which is executed through the C<PerlRequire> directive. For example: PerlRequire /home/httpd/perl/lib/startup.pl A C<PerlRequire> file name can be absolute or relative to C<ServerRoot> or a path in C<@INC>. As with any file with Perl code that gets C<use()>'d or C<require()>'d, it must return a I<true> value. To ensure that this happens don't forget to add C<1;> at the end of I<startup.pl>. =head2 Perl*Handlers As you probably know Apache traverses a loop for each HTTP request it receives. After you have compiled and installed mod_perl, your Apache mod_perl configuration directives tell Apache to invoke the module mod_perl as the handler for some request which it receives. Although it could in fact handle all the phases of the request loop, usually it does not. You tell mod_perl which phases it is to handle (and so which to leave to other modules, or to the default Apache routines) by putting C<Perl*Handler> directives in the configuration files. Because you need the Perl interpreter to be present for your Perl script to do any processing at all, there is a slight difference between the way that you configure Perl and C handlers to handle parts of the request loop. Ordinarily a C module is written, compiled and configured to hook into a specific phase of the request loop. For a Perl handler you compile mod_perl itself to hook into the appropriate phases, as if it were to handle the phases itself. Then you put C<Perl*Handler> directives in your configuration file to tell mod_perl that it is to pass the responsibility for handling that part of the request phase to your Perl module. mod_perl is an Apache module written in C. As most programmers will only need to handle the response phase, in the default compilation most of the C<Perl*Handler>s are disabled. When you configure the I<Makefile.PL> file for its compilation, you must specify whether or not you will want to handle parts of the request loop other than the usual content generation phase. If so you need to specify which parts. See the "L<Callback Hooks|install/Callback_Hooks>" section for how to do this. Apache specifies about eleven phases of the request loop, namely (and in order of processing): Post-Read-Request, URI Translation, Header Parsing, Access Control, Authentication, Authorization, MIME type checking, FixUp, Response (also known as the Content handling phase), Logging and finally Cleanup. These are the stages of a request where the Apache API allows a module to step in and do something. There is a dedicated C<Perl*Handler> for each of these stages plus a couple of others which don't correspond to parts of the request loop. We call them C<Perl*Handler> directives because the names of the many mod_perl handler directives for the various phases of the request loop all follow the same format. The C<*> in C<Perl*Handler> is a placeholder to be replaced by something which identifies the phase to be handled. For example C<PerlLogHandler> is a Perl Handler which (fairly obviously) handles the logging phase. The slight exception is C<PerlHandler>, which you can think of as C<PerlResponseHandler>. It is the content generation handler and so it is probably the one that you will use most frequently. Note that it is mod_perl which recognizes these directives, and not Apache. 
They are mod_perl directives, and an ordinary Apache does not recognize them. If you get error messages about these directives being I<"perhaps mis-spelled"> it is a sure sign that the appropriate part of mod_perl (or the entire mod_perl module!) is not present in your copy of the Apache executable. The full list of C<Perl*Handler>s follows. They are in the order that they are processed by Apache and mod_perl: PerlChildInitHandler PerlPostReadRequestHandler PerlInitHandler PerlTransHandler PerlHeaderParserHandler PerlAccessHandler PerlAuthenHandler PerlAuthzHandler PerlTypeHandler PerlFixupHandler PerlHandler PerlLogHandler PerlCleanupHandler PerlChildExitHandler PerlDispatchHandler PerlRestartHandler C<PerlChildInitHandler> and C<PerlChildExitHandler> do not refer to parts of the request loop; they are there to allow your modules to initialize data structures and to clean up at the child process start-up and shutdown respectively, for example by allocating and deallocating memory. All C<E<lt>LocationE<gt>>, C<E<lt>DirectoryE<gt>> and C<E<lt>FilesE<gt>> sections depend on a path specification. Like C<PerlChildInitHandler> and C<PerlChildExitHandler>, the directives C<PerlPostReadRequestHandler> and C<PerlTransHandler> cannot be used in these sections, nor in I<.htaccess> files, because it is not until the end of the Translation Handler (C<PerlTransHandler>) phase that the path translation is completed and a physical path is known. C<PerlInitHandler> changes its behaviour depending upon where it is used. In any case it is the first handler to be invoked in serving a request. If found outside any C<E<lt>LocationE<gt>>, C<E<lt>DirectoryE<gt>> or C<E<lt>FilesE<gt>> section (at the top level), it is an alias for C<PerlPostReadRequestHandler>. When inside any such section it is an alias for C<PerlHeaderParserHandler>. Starting from C<PerlHeaderParserHandler> the requested URI has been mapped to a physical server pathname, and thus it can be used to match a C<E<lt>LocationE<gt>>, C<E<lt>DirectoryE<gt>> or C<E<lt>FilesE<gt>> configuration section, or to look in a I<.htaccess> file if such a file exists in the specified directory in the translated path. C<PerlDispatchHandler> and C<PerlRestartHandler> do not correspond to parts of the Apache API, but allow you to fine-tune the mod_perl API. The Apache documentation will tell you all about these stages and what your modules can do. By default, most of these hooks are disabled at compile time; see the "L<Callback Hooks|install/Callback_Hooks>" section for information on enabling them. =head2 The handler subroutine By default the mod_perl API expects a subroutine called C<handler()> to handle the request in the registered C<Perl*Handler> module. Thus if your module implements this subroutine, you can register the handler with mod_perl like this: Perl*Handler Apache::Foo Replace C<Perl*Handler> with the name of a specific handler from the list given above. mod_perl will load the specified module for you when the handler is first invoked. Please note that this approach will not preload the module at startup. To make sure it gets loaded you have three options: you can explicitly preload it with the C<PerlModule> directive: PerlModule Apache::Foo You can preload it in the startup file: use Apache::Foo (); Or you can use a nice shortcut that the C<Perl*Handler> syntax provides: Perl*Handler +Apache::Foo Note the leading C<+> character.
This directive is equivalent to: PerlModule Apache::Foo Perl*Handler Apache::Foo If you decide to give the handler routine a name other than C<handler>, for example C<my_handler>, you must preload the module and explicitly give the name of the handler subroutine: PerlModule Apache::Foo Perl*Handler Apache::Foo::my_handler As you have seen, this will preload the module at server startup. If a module needs to know which handler is currently being run, it can find out with the I<current_callback> method. This method is most useful to I<PerlDispatchHandlers> which wish to take action for certain phases only. if($r->current_callback eq "PerlLogHandler") { $r->warn("Logging request"); } =head2 Stacked Handlers With the mod_perl stacked handlers mechanism, during any stage of a request it is possible for more than one C<Perl*Handler> to be defined and run. C<Perl*Handler> directives (in your configuration files) can define any number of subroutines. For example: PerlTransHandler OneTrans TwoTrans RedTrans BlueTrans With the method C<Apache-E<gt>push_handlers()>, callbacks (handlers) can be added to a stack I<at runtime> by mod_perl scripts. C<Apache-E<gt>push_handlers()> takes the callback hook name as its first argument and a subroutine name or reference as its second. Here's an example: use Apache::Constants qw(:common); sub my_logger { # some code here return OK; } Apache->push_handlers("PerlLogHandler", \&my_logger); Here's another one: use Apache::Constants qw(:common); $r->push_handlers("PerlLogHandler", sub { print STDERR "__ANON__ called\n"; return OK; }); After each request, this stack is erased. All handlers will be called unless a handler returns a status other than C<OK> or C<DECLINED>. Example uses: C<CGI.pm> maintains a global object for its plain function interface. Since the object is global, it does not go out of scope and C<DESTROY> is never called. C<CGI-E<gt>new> can call: Apache->push_handlers("PerlCleanupHandler", \&CGI::_reset_globals); This function will be called during the final stage of a request, refreshing C<CGI.pm>'s globals before the next request comes in. C<Apache::DCELogin> establishes a DCE login context which must exist for the lifetime of a request, so the C<DCE::Login> object is stored in a global variable. Without stacked handlers, users must set PerlCleanupHandler Apache::DCELogin::purge in the configuration files to destroy the context. This is not "user-friendly". Now, C<Apache::DCELogin::handler> can call: Apache->push_handlers("PerlCleanupHandler", \&purge); Persistent database connection modules such as C<Apache::DBI> could push a C<PerlCleanupHandler> handler that iterates over C<%Connected>, refreshing connections or just checking that connections have not gone stale. Remember, by the time we get to C<PerlCleanupHandler>, the client has what it wants and has gone away, so we can spend as much time as we want here without slowing down response time to the client (although the process itself is unavailable for serving new requests until the operation is completed). C<PerlTransHandlers> (e.g. C<Apache::MsqlProxy>) may decide, based on the URI or some arbitrary condition, whether or not to handle a request. Without stacked handlers, users must configure it themselves: PerlTransHandler Apache::MsqlProxy::translate PerlHandler Apache::MsqlProxy C<PerlHandler> is never actually invoked unless C<translate()> sees that the request is a proxy request (C<$r-E<gt>proxyreq>).
If it is a proxy request, C<translate()> sets C<$r-E<gt>handler("perl-script")>, and only then will C<PerlHandler> handle the request. Now users do not have to specify C<PerlHandler Apache::MsqlProxy>; the C<translate()> function can set it with C<push_handlers()>. Imagine that you want to include footers, headers, etc., piecing together a document, without using SSI. The following example shows how to implement it. First we prepare the code as follows: Test/Compose.pm --------------- package Test::Compose; use Apache::Constants qw(:common); sub header { my $r = shift; $r->content_type("text/plain"); $r->send_http_header; $r->print("header text\n"); return OK; } sub body { shift->print("body text\n") ; return OK} sub footer { shift->print("footer text\n") ; return OK} 1; __END__ # in httpd.conf or perl.conf PerlModule Test::Compose <Location /foo> SetHandler "perl-script" PerlHandler Test::Compose::header Test::Compose::body Test::Compose::footer </Location> Parsing the output of another PerlHandler? This is a little more tricky, but consider: <Location /foo> SetHandler "perl-script" PerlHandler OutputParser SomeApp </Location> <Location /bar> SetHandler "perl-script" PerlHandler OutputParser AnotherApp </Location> Now, C<OutputParser> goes first, but it C<untie()>'s C<*STDOUT> and re-C<tie()>'s it to its own package like so: package OutputParser; sub handler { my $r = shift; untie *STDOUT; tie *STDOUT => 'OutputParser', $r; } sub TIEHANDLE { my($class, $r) = @_; bless { r => $r}, $class; } sub PRINT { my $self = shift; for (@_) { # do whatever you want to $_, for example: $self->{r}->print($_ . "[insert stuff]"); } } 1; __END__ To build in this feature, configure with: % perl Makefile.PL PERL_STACKED_HANDLERS=1 [ ... ] If you want to test whether your running mod_perl Apache can stack handlers, the method C<Apache-E<gt>can_stack_handlers> will return C<TRUE> if mod_perl was configured with C<PERL_STACKED_HANDLERS=1>, and C<FALSE> otherwise. =head2 Perl Method Handlers If a C<Perl*Handler> is prototyped with C<$$>, this handler will be invoked as a method. For example: package My; @ISA = qw(BaseClass); sub handler ($$) { my($class, $r) = @_; ...; } package BaseClass; sub method ($$) { my($class, $r) = @_; ...; } 1; __END__ Configuration: PerlHandler My or PerlHandler My->handler Since the handler is invoked as a method, it may inherit from other classes: PerlHandler My->method In this case, the C<My> class inherits this method from C<BaseClass>. To build in this feature, configure with: % perl Makefile.PL PERL_METHOD_HANDLERS=1 [ ... ] =head2 PerlFreshRestart To reload C<PerlRequire>'d and C<PerlModule>'d files and other C<use()>'d modules, and to flush the C<Apache::Registry> cache on server restart, add to I<httpd.conf>: PerlFreshRestart On Make sure you read L<Evil things might happen when using PerlFreshRestart|troubleshooting/Evil_things_might_happen_when_us>. Starting from mod_perl version 1.22, C<PerlFreshRestart> is ignored when mod_perl is compiled as a DSO. This hardly matters, though, since mod_perl as a DSO will do a full tear-down (perl_destruct()), so it's still a I<FreshRestart>, just fresher than with static (non-DSO) mod_perl :) Note that even if you have PerlFreshRestart No and mod_perl as a DSO, you will still get a I<FreshRestart>. =head2 PerlSetVar, PerlSetEnv and PerlPassEnv PerlSetEnv key val PerlPassEnv key C<PerlPassEnv> passes, and C<PerlSetEnv> sets and passes, I<ENVironment> variables to your scripts.
You can access them in your scripts through C<%ENV> (e.g. C<$ENV{"key"}>). Regarding the setting of C<PerlPassEnv PERL5LIB> in I<httpd.conf>: if you turn on taint checks (C<PerlTaintCheck On>), C<$ENV{PERL5LIB}> will be ignored (unset). See the 'L<Switches -w, -T|porting/Command_Line_Switches_w_T_e>' section. C<PerlSetVar> is very similar to C<PerlSetEnv>, but its value is retrieved with a different method. PerlSetVar foo bar or push @{ $Location{"/"}->{PerlSetVar} }, [ foo => 'bar' ]; and in the code you read it with: my $r = Apache->request; print $r->dir_config('foo'); The above prints: bar Note that you cannot do this: push @{ $Location{"/"}->{PerlSetVar} }, [ foo => \%bar ]; All values are treated as strings, so you will get a stringified reference to a hash as a value (something which will look like "C<HASH(0x87a5108)>"). This cannot be turned back into a reference and therefore into the original hash upon retrieval. However, you can use the C<PerlAddVar> directive to push more values into the variable, emulating arrays. For example: PerlSetVar foo bar PerlAddVar foo bar1 PerlAddVar foo bar2 or the equivalent: PerlAddVar foo bar PerlAddVar foo bar1 PerlAddVar foo bar2 To retrieve the values use the C<dir_config-E<gt>get()> method: my @foo = $r->dir_config->get('foo'); or my %foo = $r->dir_config->get('foo'); Make sure that you use an even number of elements if you store the retrieved values in a hash. While Apache's C<SetEnv> and mod_perl's C<PerlSetEnv> apparently do the same thing, the former doesn't take effect until the fixup phase, whereas the latter takes effect as soon as possible, so its variables are available earlier, e.g. in C<PerlAuthenHandler> for C<$ENV{ORACLE_HOME}> (or any other environment variable that you need in these early request processing stages). =head2 PerlSetupEnv See L<PerlSetupEnv Off|performance/PerlSetupEnv_Off>. =head2 PerlWarn and PerlTaintCheck For B<PerlWarn> and B<PerlTaintCheck> directives see the 'L<Switches -w, -T|porting/Command_Line_Switches_w_T_e>' section. =head2 MinSpareServers MaxSpareServers StartServers MaxClients MaxRequestsPerChild C<MinSpareServers>, C<MaxSpareServers>, C<StartServers> and C<MaxClients> are standard Apache configuration directives that control the number of servers that will be launched at server startup and kept alive during the server's operation. C<MaxRequestsPerChild> lets you specify the maximum number of requests which each child will be allowed to serve. When a process has served C<MaxRequestsPerChild> requests the parent kills it and replaces it with a new one. There may also be other reasons why a child is killed, so it does not mean that each child will in fact serve this many requests, only that it will not be allowed to serve more than that number. These five directives are very important for achieving the best performance from your server. The section 'L<Performance Tuning by Tweaking Apache Configuration|performance/Performance_Tuning_by_Tweaking_A>' provides all the details. =head1 The Startup File At server startup, before child processes are spawned to receive incoming requests, there is more that can be done than just preloading files. You might want to register code that will initialize a database connection for each child when it is forked, tie read-only dbm files, etc. The I<startup.pl> file is an ideal place to put the code that should be executed when the server starts.
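For example, tying a read-only dbm file once at startup might look like the following sketch (the database path and the package variable are made up for illustration; the dbm chapter of this Guide covers the details and the locking caveats):

  use Fcntl qw(O_RDONLY);
  use DB_File ();
  use vars qw(%CONFIG);
  # tie once at server startup; the forked children inherit the
  # read-only tied hash
  tie %CONFIG, 'DB_File', '/home/httpd/perl/data/config.db', O_RDONLY, 0644
      or die "Cannot tie config.db: $!";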
Once you have prepared the code, load it in I<httpd.conf> before the rest of the mod_perl configuration directives like this:

  PerlRequire /home/httpd/perl/lib/startup.pl

I must stress that all the code that is run at server initialization time is run with root privileges if you are executing it as the root user (which you have to do unless you choose to run the server on an unprivileged port, above 1024). This means that anyone who has write access to a script or module that is loaded by C<PerlModule> or C<PerlRequire> effectively has root access to the system. You might want to take a look at the new and experimental C<PerlOpmask> directive and C<PERL_OPMASK_DEFAULT> compile time option to try to disable some of the more dangerous operations.

Since the startup file is written in plain Perl, one can validate its syntax with:

  % perl -c /home/httpd/perl/lib/startup.pl

=head2 The Sample Startup File

Let's look at a real world startup file:

  startup.pl
  ----------
  use strict;

  # Extend @INC if needed
  use lib qw(/dir/foo /dir/bar);

  # Make sure we are in a sane environment.
  $ENV{MOD_PERL} or die "not running under mod_perl!";

  # For things in the "/perl" URL
  use Apache::Registry;

  # Load Perl modules of your choice here
  # This code is interpreted *once* when the server starts
  use LWP::UserAgent ();
  use Apache::DBI ();
  use DBI ();

  # Tell me more about warnings
  use Carp ();
  $SIG{__WARN__} = \&Carp::cluck;

  # Load CGI.pm and call its compile() method to precompile
  # (but not to import) its autoloaded methods.
  use CGI ();
  CGI->compile(':all');

  # Initialize the database connections for each child
  Apache::DBI->connect_on_init
    ("DBI:mysql:database=test;host=localhost",
     "user", "password",
     {
      PrintError => 1, # warn() on errors
      RaiseError => 0, # don't die on error
      AutoCommit => 1, # commit executes immediately
     }
    );

Now we'll review the code, explaining why each line is used.

  use strict;

This pragma is worth using in every script longer than half a dozen lines. It will save a lot of time and debugging later on.

  use lib qw(/dir/foo /dir/bar);

The only chance to permanently modify C<@INC> before the server is started is with this command. Later the running code can modify C<@INC> just for the moment it C<require()>'s some file, and then C<@INC>'s value gets reset to what it was originally.

  $ENV{MOD_PERL} or die "not running under mod_perl!";

A sanity check: if Apache/mod_perl wasn't properly built, the above code will abort the server startup.

  use Apache::Registry;
  use LWP::UserAgent ();
  use Apache::DBI ();
  use DBI ();

Preload the modules that get used by our Perl code serving the requests. Unless you need the symbols (variables and subroutines) exported by the modules you preload to accomplish something within the startup file, don't import them, since it's just a waste of startup time. Instead use the empty list C<()> to tell the C<import()> function not to import anything.

  use Carp ();
  $SIG{__WARN__} = \&Carp::cluck;

This is a useful snippet to enable extended warnings logged in the error_log file. In addition to basic warnings, a trace of calls is added. This makes tracking down the potential problem a much easier task, since you know who called whom. For example, with normal warnings you might see:

  Use of uninitialized value at
    /usr/lib/perl5/site_perl/5.005/Apache/DBI.pm line 110.

but you have no idea where it was called from. When we use C<Carp> as shown above we might see:

  Use of uninitialized value at
    /usr/lib/perl5/site_perl/5.005/Apache/DBI.pm line 110.
  Apache::DBI::connect(undef, 'mydb::localhost', 'user', 'passwd',
      'HASH(0x87a5108)') called at
      /usr/lib/perl5/site_perl/5.005/i386-linux/DBI.pm line 382
  DBI::connect('DBI', 'DBI:mysql:mydb::localhost', 'user', 'passwd',
      'HASH(0x8375e4c)') called at
      /usr/lib/perl5/site_perl/5.005/Apache/DBI.pm line 36
  Apache::DBI::__ANON__('Apache=SCALAR(0x87a50c0)') called at
      PerlChildInitHandler subroutine `Apache::DBI::__ANON__' line 0
  eval {...} called at
      PerlChildInitHandler subroutine `Apache::DBI::__ANON__' line 0

We clearly see that the warning was triggered by eval()'uating C<Apache::DBI::__ANON__>, which called C<DBI::connect> (with the arguments that we can see as well), which in turn called the C<Apache::DBI::connect> method. Now we know where to look for our problem.

  use CGI ();
  CGI->compile(':all');

Some modules create their subroutines at run time to improve their load time. This helps when the module includes many subroutines, but only a few are actually used. C<CGI.pm> falls into this category. Since with mod_perl the module is loaded only once, it might be a good idea to precompile all or some of its methods. C<CGI.pm>'s C<compile()> method performs this task. Notice that this is a function specific to this module; other modules may or may not implement this feature, and those that do may use a different name for it. As with all modules we preload in the startup file, we don't import symbols from them, as they will be lost when they go out of the file's scope.

Note that starting with C<$CGI::VERSION> 2.46, the recommended method to precompile the code in C<CGI.pm> is:

  use CGI qw(-compile :all);

But the old method is still available for backward compatibility.

See also the 'L<Apache::Status -- Embedded interpreter status information|debug/Apache_Status_Embedded_Inter>' section.

=head2 What Modules You Should Add to the Startup File and Why

Every module loaded at server startup will be shared among the server children, saving a lot of RAM on your machine. Usually I put most of the code I develop into modules and preload them.

You can even preload your CGI script with C<Apache::RegistryLoader> (see L<Preload Perl modules at server startup|performance/Preloading_Perl_Modules_at_Serve>) and you can get the children to preopen their database connections with C<Apache::DBI>.

=head2 The Confusion with use() in the Server Startup File

Some people wonder why you need to duplicate the C<use()> clause in the startup file and in the script itself. The confusion arises due to misunderstanding the C<use()> function. C<use()> normally performs two operations, namely C<require()> and C<import()>, called within a C<BEGIN> block. See the section "L<use()|perl/use__>" for a detailed explanation of the use(), require() and import() functions.

In the startup file we don't want to import any symbols, since they will be lost when we leave the scope of the startup file anyway, i.e. they won't be visible to any of the child processes which run our mod_perl scripts. Instead we want to preload the module in the startup file and then import any symbols that we actually need in each script individually.

Normally when we write C<use MyModule;>, C<use()> will both load the module and import its symbols; so for the startup file we write C<use MyModule ();> and the empty parentheses will ensure that the module is loaded but that no symbols are imported. Then in the actual mod_perl script we write C<use()> in the standard way, e.g. C<use MyModule;>.

Since the module has already been preloaded, the only action taken is to import the symbols. For example in the startup file you write:

  use CGI ();

since you probably don't need any symbols to be imported there. But in your code you would probably write:

  use CGI qw(:html);

For example, if you have C<use()>'d C<Apache::Constants> in the startup file, it does not mean you can have the following handler:

  package MyModule;

  sub handler {
      my $r = shift;
      ## Cool stuff goes here
      return OK;
  }

  1;

You would either need to add:

  use Apache::Constants qw( OK );

Or use the fully qualified name:

  return Apache::Constants::OK;

If you want to use the function interface without exporting the symbols, use fully qualified function names, e.g. C<CGI::param>. The same rule applies to variables: you can import variables, or you can access them by their full name, e.g. C<$My::Module::bar>. When you use the object oriented (method) interface you don't need to export the method symbols.

Technically, you aren't required to supply the C<use()> statement in your (handler?) code if it was already loaded during server startup (i.e. by 'C<PerlRequire startup.pl>'). When writing your code, however, you should not assume the module code has been preloaded. In the future, you or someone else will revisit this code and will not understand how it is possible to use a module's methods without first loading the module itself.

Read the C<Exporter> and C<perlmod> manpages for more information about C<import()>.

=head1 Apache Configuration in Perl

With C<E<lt>PerlE<gt>>...C<E<lt>/PerlE<gt>> sections, it is possible to configure your server entirely in Perl.

=head2 Usage

C<E<lt>PerlE<gt>> sections can contain I<any> and as much Perl code as you wish. These sections are compiled into a special package whose symbol table mod_perl can then walk and grind the names and values of Perl variables/structures through the Apache core configuration gears. Most of the configuration directives can be represented as scalars (C<$scalar>) or lists (C<@list>). A C<@list> inside these sections is simply converted into a space delimited string for you. Here is an example:

  httpd.conf
  ----------
  <Perl>
  @PerlModule = qw(Mail::Send Devel::Peek);

  # run the server as whoever starts it
  $User  = getpwuid($>) || $>;
  $Group = getgrgid($)) || $);

  $ServerAdmin = $User;
  </Perl>

Block sections such as C<E<lt>LocationE<gt>>..C<E<lt>/LocationE<gt>> are represented in a C<%Location> hash, e.g.:

  <Perl>
  $Location{"/~dougm/"} = {
    AuthUserFile   => '/tmp/htpasswd',
    AuthType       => 'Basic',
    AuthName       => 'test',
    DirectoryIndex => [qw(index.html index.htm)],
    Limit          => {
        METHODS => 'GET POST',
        require => 'user dougm',
    },
  };
  </Perl>

If an Apache directive can take two or three arguments you may push strings (the lowest number of arguments will be shifted off the C<@list>) or use an array reference to handle any number greater than the minimum for that directive:

  push @Redirect, "/foo", "http://www.foo.com/";

  push @Redirect, "/imdb", "http://www.imdb.com/";

  push @Redirect, [qw(temp "/here" "http://www.there.com")];

Other section counterparts include C<%VirtualHost>, C<%Directory> and C<%Files>.

To pass all environment variables to the children with a single configuration directive, rather than listing each one via C<PassEnv> or C<PerlPassEnv>, a C<E<lt>PerlE<gt>> section could read in a file and:

  push @PerlPassEnv, [$key => $val];

or

  Apache->httpd_conf("PerlPassEnv $key $val");

These are somewhat simple examples, but they should give you the basic idea. You can mix in any Perl code you desire.

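For instance, the C<%VirtualHost> hash mentioned above works just like C<%Location>. A minimal sketch (the host name and path are made up for illustration):

  <Perl>
  $VirtualHost{'www.example.com'} = {
      ServerName   => 'www.example.com',
      DocumentRoot => '/home/httpd/docs/example',
  };
  </Perl>
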
See I<eg/httpd.conf.pl> and I<eg/perl_sections.txt> in the mod_perl distribution for more examples.

Assume that you have a cluster of machines with similar configurations and only small distinctions between them: ideally you would want to maintain a single configuration file, but because the configurations aren't I<exactly> the same (e.g. the C<ServerName> directive) it's not quite that simple.

C<E<lt>PerlE<gt>> sections come to the rescue. Now you have a single configuration file and the full power of Perl to tweak the local configuration. For example to solve the problem of the C<ServerName> directive you might have this C<E<lt>PerlE<gt>> section:

  <Perl>
  $ServerName = `hostname`;
  </Perl>

For example if you want to allow personal directories on all machines except the ones whose names start with I<secure>:

  <Perl>
  $ServerName = `hostname`;
  if ( $ServerName !~ /^secure/) {
    $UserDir = "public.html";
  }
  else {
    $UserDir = "DISABLED";
  }
  </Perl>

Behind the scenes, mod_perl defines a package called C<Apache::ReadConfig>. Here it keeps all the variables that you define inside the C<E<lt>PerlE<gt>> sections. Therefore it's not necessary to configure the server within the C<E<lt>PerlE<gt>> sections. Actually what you can do is to write the Perl code to configure the server just like you'd do in the C<E<lt>PerlE<gt>> sections, but instead place it into a separate file that should be called during the configuration parsing with either the C<PerlModule> or C<PerlRequire> directives, or from within the startup file. All you have to do is to declare the package C<Apache::ReadConfig> within this file. Using the last example:

  apache_config.pl
  ----------------
  package Apache::ReadConfig;

  $ServerName = `hostname`;
  if ( $ServerName !~ /^secure/) {
    $UserDir = "public.html";
  }
  else {
    $UserDir = "DISABLED";
  }

  1;

  httpd.conf
  ----------
  PerlRequire /home/httpd/perl/lib/apache_config.pl

=head2 Enabling

To enable C<E<lt>PerlE<gt>> sections you should build mod_perl with S<perl Makefile.PL PERL_SECTIONS=1 [ ... ]>.

=head2 Caveats

Be careful when you declare package names inside C<E<lt>PerlE<gt>> sections, for example this code has a problem:

  <Perl>
    package My::Trans;
    use Apache::Constants qw(:common);
    sub handler { OK }

    $PerlTransHandler = "My::Trans";
  </Perl>

When you put code inside a C<E<lt>PerlE<gt>> section, by default it actually goes into the C<Apache::ReadConfig> package, which is already declared for you. This means that the C<PerlTransHandler> we have tried to define above is actually undefined. If you define a different package name within a C<E<lt>PerlE<gt>> section you must make sure to close the scope of that package and return to the C<Apache::ReadConfig> package when you want to define the configuration directives, like this:

  <Perl>
    package My::Trans;
    use Apache::Constants qw(:common);
    sub handler { OK }

    package Apache::ReadConfig;
    $PerlTransHandler = "My::Trans";
  </Perl>

=head2 Verifying

This section shows how to check and dump the configuration you have made with the help of C<E<lt>PerlE<gt>> sections in I<httpd.conf>.

To check the C<E<lt>PerlE<gt>> section syntax outside of httpd, we make it look like a Perl script:

  <Perl>
  # !perl
  # ... code here ...
  __END__
  </Perl>

Now you may run:

  perl -cx httpd.conf

In a running httpd you can see how you have configured the C<E<lt>PerlE<gt>> sections through the URI L</perl-status|debug/Apache__Status____Embedded_Interpreter_Status_Information>, by choosing I<Perl Section Configuration> from the menu.

In order to make this item show up in the menu you should set C<$Apache::Server::SaveConfig> to a true value. When you do that, the I<Apache::ReadConfig> namespace (in which the configuration data is stored) will not be flushed, making the configuration data available to Perl modules at request time. Example:

  <Perl>
  $Apache::Server::SaveConfig = 1;

  $DocumentRoot = ...
  ...
  </Perl>

At request time, the value of B<$DocumentRoot> can be accessed with the fully qualified name B<$Apache::ReadConfig::DocumentRoot>.

You can dump the configuration of C<E<lt>PerlE<gt>> sections like this:

  <Perl>
  use Apache::PerlSections();
  ...
  # Configuration Perl code here
  ...
  print STDERR Apache::PerlSections->dump();
  </Perl>

Alternatively you can store it in a file:

  Apache::PerlSections->store("httpd_config.pl");

You can then C<require()> that file in some other C<E<lt>PerlE<gt>> section.

=head2 Strict C<E<lt>PerlE<gt>> Sections

If the Perl code doesn't compile, the server won't start. If the generated Apache config is invalid, C<E<lt>PerlE<gt>> sections have always just logged an error and carried on, since there might be globals in the section that are not intended for the config.

The variable C<$Apache::Server::StrictPerlSections> has been added in mod_perl version 1.22. If you set this variable to a true value, for example:

  $Apache::Server::StrictPerlSections = 1;

then mod_perl will not tolerate invalid Apache configuration syntax and will C<croak> (die) if it encounters any. At the time of writing the default value is C<0>.

=head2 Debugging

If you compile mod_perl with C<PERL_TRACE=1> and set the environment variable L<MOD_PERL_TRACE|debug/Debug_Tracing>, then you should see some useful diagnostics when mod_perl is processing C<E<lt>PerlE<gt>> sections.

=head2 References

For more info see http://www.modperl.com Chapter 8

META: a direct link?

=head1 Validating the Configuration Syntax

C<apachectl configtest> tests the configuration file without starting the server. You can safely validate the configuration file on your production server, if you run this test before you restart the server with C<apachectl restart>. Of course it is not 100% perfect, but it will reveal any syntax errors you might have made while editing the file.

'C<apachectl configtest>' is the same as 'C<httpd -t>' and it doesn't just parse the code in I<startup.pl>, it actually executes it. C<E<lt>PerlE<gt>> configuration has always started Perl during the configuration read, and C<Perl{Require,Module}> do so as well.

Of course we assume that the code that gets called during this test cannot cause any harm to your running production environment. The following hint shows how to prevent the code in the startup script and C<E<lt>PerlE<gt>> sections from being executed during the syntax check, if that's what you want.

If you want your startup code to get control over the C<-t> (C<configtest>) server launch, start the server configuration test with:

  httpd -t -Dsyntax_check

and, if for example you want to prevent your startup code from being executed, at the top of the code add:

  return if Apache->define('syntax_check');

=head1 Enabling Remote Server Configuration Reports

The nifty mod_info module displays the complete server configuration in your browser. In order to use it you have to compile it in or, if the server was compiled with DSO mode enabled, load it as an object.

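For example, on an Apache 1.3 server built with DSO support, it might be loaded with something like the following (the module path depends on your installation, and a matching C<AddModule mod_info.c> line may also be needed if you use C<ClearModuleList>):

  LoadModule info_module libexec/mod_info.so
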
Then just uncomment the ready-prepared section in the I<httpd.conf> file:

  <Location /server-info>
    SetHandler server-info
    Order deny,allow
    Deny from all
    Allow from www.example.com
  </Location>

Now restart the server and issue the request:

  http://www.example.com/server-info

=head1 Publishing Port Numbers other than 80

If you are using a two-server setup, with a mod_perl server listening on a high port, it is advised that you do not publish the high port number in URLs. Rather use a proxying rewrite rule in the non-mod_perl server:

  RewriteEngine     On
  RewriteLogLevel   0
  RewriteRule       ^/perl/(.*) http://localhost:8080/perl/$1 [P]
  ProxyPassReverse  /           http://localhost/

I was told one problem with publishing high port numbers is that IE 4.x has a bug when re-posting data to a non-port-80 URL. It drops the port designator and uses port 80 anyway.

Another reason is that firewalls will probably have the high port closed, therefore users behind firewalls will be unable to reach your service running on the blocked port.

=head1 Configuring Apache + mod_perl with mod_macro

mod_macro is an Apache module written by Fabien Coelho that lets you define and use macros in the Apache configuration file. mod_macro can be really useful when you have many virtual hosts, where each virtual host has a number of scripts/modules, most of them with a moderately complex configuration setup.

First download the latest version of mod_macro from http://www.cri.ensmp.fr/~coelho/mod_macro/ , and configure your Apache server to use this module.

Here are some useful macros for mod_perl users:

  # set up a registry script
  <Macro registry>
  SetHandler "perl-script"
  PerlHandler Apache::Registry
  Options +ExecCGI
  </Macro>

  # example
  Alias /stuff /usr/www/scripts/stuff
  <Location /stuff>
  Use registry
  </Location>

If your registry scripts are all located in the same directory, and your aliasing rules are consistent, you can use this macro:

  # set up a registry script for a specific location
  <Macro registry $location $script>
  Alias /$location /home/httpd/perl/scripts/$script
  <Location /$location>
  SetHandler "perl-script"
  PerlHandler Apache::Registry
  Options +ExecCGI
  </Location>
  </Macro>

  # example
  Use registry stuff stuff.pl

If you're using content handlers packaged as modules, you can use the following macro:

  # set up a mod_perl content handler module
  <Macro modperl $module>
  SetHandler "perl-script"
  Options +ExecCGI
  PerlHandler $module
  </Macro>

  # examples
  <Location /perl-status>
  PerlSetVar StatusPeek On
  PerlSetVar StatusGraph On
  PerlSetVar StatusDumper On
  Use modperl Apache::Status
  </Location>

The following macro sets up a Location for use with C<HTML::Embperl>. Here we define all ".html" files to be processed by C<Embperl>.

  <Macro embperl>
  SetHandler "perl-script"
  Options +ExecCGI
  PerlHandler HTML::Embperl
  PerlSetEnv EMBPERL_FILESMATCH \.html$
  </Macro>

  # examples
  <Location /mrtg>
  Use embperl
  </Location>

Macros are also very useful for things that tend to be verbose, such as setting up Basic Authentication:

  # Sets up Basic Authentication
  <Macro BasicAuth $realm $group>
  Order deny,allow
  Satisfy any
  AuthType Basic
  AuthName $realm
  AuthGroupFile /usr/www/auth/groups
  AuthUserFile /usr/www/auth/users
  Require group $group
  Deny from all
  </Macro>

  # example of use
  <Location /stats>
  Use BasicAuth WebStats Admin
  </Location>

Finally, here is a complete example that uses macros to set up simple virtual hosts. It uses the C<BasicAuth> macro defined previously (yes, macros can be nested!).

  <Macro vhost $ip $domain $docroot $admingroup>
  <VirtualHost $ip>
  ServerAdmin [EMAIL PROTECTED]
  DocumentRoot /usr/www/htdocs/$docroot
  ServerName www.$domain
  <Location /stats>
  Use BasicAuth Stats-$domain $admingroup
  </Location>
  </VirtualHost>
  </Macro>

  # define some virtual hosts
  Use vhost 10.1.1.1 example.com example example-admin
  Use vhost 10.1.1.2 example.net examplenet examplenet-admin

mod_macro is also useful in a non-vhost setting. Some sites for example have lots of scripts which people use to view various statistics, email settings, etc. It is much easier to read things like:

  use /forwards  email/showforwards
  use /webstats  web/showstats

The actual macros for the last example are left as an exercise to the reader. They can be easily constructed based on the examples presented in this section.

=head1 General Pitfalls

=head2 My CGI/Perl Code Gets Returned as Plain Text Instead of Being Executed by the Webserver

Check your configuration files and make sure that C<ExecCGI> is turned on in your configuration:

  <Location /perl>
    SetHandler perl-script
    PerlHandler Apache::Registry
    Options ExecCGI
    allow from all
    PerlSendHeader On
  </Location>

=head2 My Script Works under mod_cgi, but when Called via mod_perl I Get a 'Save-As' Prompt

Did you put B<PerlSendHeader On> in the configuration part of the C<E<lt>Location fooE<gt>E<lt>/LocationE<gt>>?

=head2 Is There a Way to Provide a Different startup.pl File for Each Individual Virtual Host

No. Any virtual host will be able to see the routines from a I<startup.pl> loaded for any other virtual host.

=head2 Is There a Way to Modify @INC on a Per-Virtual-Host or Per-Location Basis.

You can use C<PerlSetEnv PERL5LIB ...> or a C<PerlFixupHandler> with the C<lib> pragma (C<use lib qw(...)>).

A better way is to use L<Apache::PerlVINC|modules/Apache__PerlVINC___Allows_Module_Versioning_in_Location_blocks_and_Virtual_Hosts>.

=head2 A Script From One Virtual Host Calls a Script with the Same Path From the Other Virtual Host

This has been a bug before, last fixed in 1.15_01, i.e. if you are running 1.15, that could be the problem. You should set this variable in a startup file (which you load with C<PerlRequire> in I<httpd.conf>):

  $Apache::Registry::NameWithVirtualHost = 1;

But, as we know, sometimes a bug turns out to be a feature. If the same script is running for more than one virtual host on the same machine, this can be a waste, right? Set it to C<0> in a startup script if you want to turn it off and have this bug as a feature. (This only makes sense if you are sure that there will be no I<other> scripts with the same path/name.) It also saves you some memory.

  $Apache::Registry::NameWithVirtualHost = 0;

=head2 the Server no Longer Retrieves the DirectoryIndex Files for a Directory

The problem was reported by users who declared mod_perl configuration inside a C<E<lt>DirectoryE<gt>> section for all files matching *.pl. The problem went away after placing the directives in a C<E<lt>FilesE<gt>> section.

mod_alias and mod_rewrite are both Trans handlers in the normal case. So in the setup where both are used, if mod_alias runs first and matches, it will return OK and mod_rewrite won't see the request. The opposite can happen as well, where the mod_rewrite rules apply but the C<Alias> directives are completely ignored.

The behavior is not random, but depends on the Apache module loading order. Apache modules are executed in I<reverse> order, i.e. the module that was added first will be executed last.

The solution is not to mix mod_rewrite and mod_alias.

mod_rewrite does everything mod_alias does--except for C<ScriptAlias>, which is not really relevant to mod_perl anyway. Don't rely on the module ordering, but use explicitly disjoint URL namespaces for C<Alias> and C<Rewrite>. In other words, any URL regex that can potentially match a C<Rewrite> rule should not be used in an C<Alias>, and vice versa. Given that mod_rewrite can easily do what mod_alias does, that's no problem.

Here is one of the examples where C<Alias> is replaced with C<RedirectMatch>. This is a snippet of configuration at the light non-mod_perl Apache server:

  RewriteEngine     on
  RewriteLogLevel   0
  RewriteRule       ^/(perl.*)$  http://127.0.0.1:8045/$1  [P,L]
  RewriteRule       ^/(mail.*)$  http://127.0.0.1:8045/$1  [P,L]
  NoCache           *
  ProxyPassReverse  /  http://www.example.com/

  RedirectMatch permanent ^/$     /pages/index
  RedirectMatch permanent ^/foo$  /pages/bar

This configuration works fine because any URI that matches one of the redirects will never match one of the rewrite rules.

In the above setup we proxy requests starting with I</perl> or I</mail> to the mod_perl server, forbid proxy requests to external sites, and make sure that the proxied requests will use I<http://www.example.com/> as their URL on the way back to the client.

The C<RedirectMatch> settings work exactly as if you had written:

  Alias /    /pages/index
  Alias /foo /pages/bar

But as we said before, we don't want to mix the two.

Here is another example where the redirect is done by a rewrite rule:

  RewriteEngine on
  RewriteLogLevel 0
  RewriteMap lowercase int:tolower
  RewriteRule ^/(perl.*)$ http://127.0.0.1:8042/$1  [P,L]
  RewriteRule ^/$ /pages/welcome.htm [R=301,L]
  RewriteRule ^(.*)$ ${lowercase:$1}
  NoCache *
  ProxyPassReverse / http://www.example.com/

If we omit the rewrite rule that matches C<^/$>, and instead use a redirect, it will never be called, because the URL is still matched by the last rule C<^(.*)$>. This is a somewhat contrived example, because that last regex could be rewritten as C<^(/.+)$> and all would be well.

=head1 Configuration Security Concerns

It is better not to advertise the port that the mod_perl server uses to the outside world, for it creates a potential security risk by revealing which module(s) and/or OS you are running your web server on.

For more information see L<Publishing Port Numbers other than 80|config/Publishing_Port_Numbers_other_th>.

The more modules you have in your web server, the more complex the code. The more complex the code in your web server, the more chances for bugs. The more chances for bugs, the more chance that some of those bugs may involve security.

We were never completely sure why the default of the C<ServerTokens> directive in Apache is C<Full> rather than C<Minimal>. It seems like you would only make it C<Full> if you are debugging. Probably the reason for using C<ServerTokens Full> is showing off, so that NetCraft (http://netcraft.com) and other similar survey services will count more Apache servers, which is good for all of us, but you really want to reveal as little information as possible to potential crackers.

Another approach is to modify the httpd sources to reveal no unwanted information, so that all responses return an empty or phony C<Server:> field.

On the other hand, security by obscurity is a lack of security. Any determined cracker will eventually figure out what version of Apache you run and what third party modules you have built in.

You can see what information is revealed by your server by telnetting to it and issuing some request.

For example:

  % telnet localhost 8080
  Trying 127.0.0.1
  Connected to localhost
  Escape character is '^]'.
  HEAD / HTTP/1.0

  HTTP/1.1 200 OK
  Date: Sun, 16 Apr 2000 11:06:25 GMT
  Server: Apache/1.3.12 (Unix) mod_perl/1.22 mod_ssl/2.6.2 OpenSSL/0.9.5
  [more lines snipped]

As you can see, a lot of information is revealed when C<ServerTokens Full> is in effect.

=head1 Apache Restarts Twice On Start

When the server is restarted, the configuration and module initialization phases are called twice in total before the children are forked. The second restart is done in order to ensure that future restarts will work correctly, by making sure that all modules can survive a restart (C<SIGHUP>). This is very important if you restart a production server.

You can control what code will be executed on the start or restart by checking the values of C<$Apache::Server::Starting> and C<$Apache::Server::ReStarting> respectively. The former variable is I<true> when the server is starting and the latter is I<true> when it's restarting. For example:

  <Perl>
  print STDERR "Server is Starting\n"   if $Apache::Server::Starting;
  print STDERR "Server is ReStarting\n" if $Apache::Server::ReStarting;
  </Perl>

The I<startup.pl> file and similar files loaded via C<PerlModule> or C<PerlRequire> are compiled only once, because once a module is compiled it enters the special C<%INC> hash. When Apache restarts, Perl checks whether the module or script in question is already registered in C<%INC> and won't try to compile it again.

So the only code that you might need to protect from running on restart is the code in the C<E<lt>PerlE<gt>> sections. But since the C<E<lt>PerlE<gt>> sections are mainly used for creating configuration on the fly, there shouldn't be a reason why running the code more than once would be undesirable.

=head1 Knowing the proxy_pass'ed Connection Type

Let's say that you have a frontend server running mod_ssl, mod_rewrite and mod_proxy. You want to make sure that your user is using a secure connection for some specific actions like login information submission. You don't want to let the user login unless the request was submitted through a secure port.

Since you have to proxy_pass the request between the front and back end servers, you cannot know where the connection has come from. Neither is using the HTTP headers reliable.

A possible solution for this problem is to have the mod_perl server listen on two different ports (e.g. 8000 and 8001) and have the mod_rewrite proxy rule in the regular server redirect to port 8000 and the mod_rewrite proxy rule in the SSL virtual host redirect to port 8001. In the mod_perl server just check which port the request came in on to tell if the connection is secure.

=head1 Adding Custom Configuration Directives

This is covered in the Eagle Book in great detail. This is just a simple example, showing how to add your own configuration directives.

  Makefile.PL
  -----------
  package Apache::TestDirective;

  use ExtUtils::MakeMaker;

  use Apache::ExtUtils qw(command_table);
  use Apache::src ();

  my @directives = (
      {
       name         => 'Directive4',
       errmsg       => 'Anything',
       args_how     => 'RAW_ARGS',
       req_override => 'OR_ALL',
      },
  );

  command_table(\@directives);

  WriteMakefile(
      'NAME'         => 'Apache::TestDirective',
      'VERSION_FROM' => 'TestDirective.pm',
      'INC'          => Apache::src->new->inc,
  );

  TestDirective.pm
  ----------------
  package Apache::TestDirective;

  use strict;
  use Apache::ModuleConfig ();
  use DynaLoader ();

  if ($ENV{MOD_PERL}) {
      no strict;
      $VERSION = '0.01';
      @ISA = qw(DynaLoader);
      __PACKAGE__->bootstrap($VERSION); # command table, etc.
  }

  sub Directive4 {
      warn "Directive4 @_\n";
  }

  1;
  __END__

In the mod_perl source tree, add this to I<t/docs/startup.pl>:

  use blib qw(/home/dougm/test/Apache/TestDirective);

and at the bottom of I<t/conf/httpd.conf>:

  PerlModule Apache::TestDirective
  Directive4 hi

Test it:

  % make start_httpd
  % make kill_httpd

You should see:

  Directive4 Apache::TestDirective=HASH(0x83379d0)
  Apache::CmdParms=SCALAR(0x862b80c) hi

And in the error log file:

  % grep Directive4 t/logs/error_log
  Directive4 Apache::TestDirective=HASH(0x83119dc)
  Apache::CmdParms=SCALAR(0x8326878) hi

If it didn't work as expected, try building mod_perl with C<PERL_TRACE=1>, then do:

  setenv MOD_PERL_TRACE all

before starting the server. Now you should get some useful diagnostics.

=cut

1.1                  modperl-docs/src/docs/1.0/guide/control.pod

Index: control.pod
===================================================================

=head1 NAME

Controlling and Monitoring the Server

=head1 Restarting Techniques

All of these techniques require that you know the server process id (PID). The easiest way to find the PID is to look it up in the I<httpd.pid> file. It's easy to discover where to look, by looking in the I<httpd.conf> file. Open the file and locate the entry C<PidFile>. Here is the line from one of my own I<httpd.conf> files:

  PidFile /usr/local/var/httpd_perl/run/httpd.pid

As you see, with my configuration the file is I</usr/local/var/httpd_perl/run/httpd.pid>.

Another way is to use the C<ps> and C<grep> utilities. Assuming that the binary is called I<httpd_perl>, we would do:

  % ps auxc | grep httpd_perl

or maybe:

  % ps -ef | grep httpd_perl

This will produce a list of all the C<httpd_perl> (parent and children) processes. You are looking for the parent process. If you run your server as root, you will easily locate it since it belongs to root. If you run the server as some other user (when you L<don't have root access|install/Installation_Without_Superuser_P>), the processes will belong to that user unless defined differently in I<httpd.conf>. It's still easy to find which is the parent--usually it's the process with the smallest PID.

You will see several C<httpd> processes running on your system, but you should never need to send signals to any of them except the parent, whose pid is in the I<PidFile>. There are three signals that you can send to the parent: C<SIGTERM>, C<SIGHUP>, and C<SIGUSR1>.

Some folks prefer to specify signals using numerical values, rather than using symbols. If you are looking for these, check out your C<kill(1)> man page.

My page points to I</usr/include/linux/signal.h>, the relevant entries are:

  #define SIGHUP     1    /* hangup, generated when terminal disconnects */
  #define SIGKILL    9    /* last resort */
  #define SIGTERM   15    /* software termination signal */
  #define SIGUSR1   30    /* user defined signal 1 */

Note that to send these signals from the command line the C<SIG> prefix must be omitted, and under some operating systems they will need to be preceded by a minus sign, e.g. C<kill -15> or C<kill -TERM> followed by the PID.

=head1 Server Stopping and Restarting

We will concentrate here on the implications of sending C<TERM>, C<HUP>, and C<USR1> signals (as arguments to kill(1)) to a mod_perl enabled server. See http://www.apache.org/docs/stopping.html for documentation on the implications of sending these signals to a plain Apache server.

=over 4

=item TERM Signal: Stop Now

Sending the C<TERM> signal to the parent causes it to immediately attempt to kill off all its children. Any requests in progress are terminated, and no further requests are served. This process may take quite a few seconds to complete. To stop a child, the parent sends it a C<SIGHUP> signal. If that fails it sends another. If that fails it sends the C<SIGTERM> signal, and as a last resort it sends the C<SIGKILL> signal. For each failed attempt to kill a child it makes an entry in the I<error_log>.

When all the child processes have terminated, the parent itself exits and any open log files are closed. This is when all the accumulated C<END> blocks are executed, apart from the ones located in scripts running under C<Apache::Registry> or C<Apache::PerlRun> handlers; in the latter case, C<END> blocks are executed after each request is served.

=item HUP Signal: Restart Now

Sending the C<HUP> signal to the parent causes it to kill off its children as if the C<TERM> signal had been sent, i.e. any requests in progress are terminated; but the parent does not exit. Instead, the parent re-reads its configuration files, spawns a new set of child processes and continues to serve requests. It is almost equivalent to stopping and then restarting the server.

If the configuration files contain errors when a restart is signaled, the parent will exit, so it is important to check the configuration files for errors before issuing a restart. How to perform the check will be covered shortly.

Sometimes using this approach to restart a mod_perl enabled Apache may cause the processes' memory usage to grow incrementally after each restart. This happens when Perl code loaded in memory is not completely torn down, leading to a memory leak.

=item USR1 Signal: Gracefully Restart Now

The C<USR1> signal causes the parent process to advise the children to exit after serving their current requests, or to exit immediately if they're not serving a request. The parent re-reads its configuration files and re-opens its log files. As each child dies off the parent replaces it with a child from the new generation (the new children use the new configuration) and it begins serving new requests immediately.

The only difference between C<USR1> and C<HUP> is that C<USR1> allows the children to complete any current requests prior to killing them off, so there is no interruption in service; with the C<HUP> signal, by contrast, it might take a few seconds for the restart to complete, during which no requests are served.

=back

By default, if a server is restarted (using C<kill -USR1 `cat logs/httpd.pid`> or with the C<HUP> signal), Perl scripts and modules are not reloaded.

To reload C<PerlRequire>s, C<PerlModule>s, other C<use()>'d modules and flush the C<Apache::Registry> cache, use this directive in I<httpd.conf>:

  PerlFreshRestart On

Make sure you read L<Evil things might happen when using PerlFreshRestart|troubleshooting/Evil_things_might_happen_when_us>.

=head1 Speeding up the Apache Termination and Restart

We've already mentioned that restart or termination can sometimes take quite a long time (e.g. tens of seconds) for a mod_perl server. The reason for that is a call to the C<perl_destruct()> Perl API function during the child exit phase. This will cause proper execution of C<END> blocks found during server startup and will invoke the C<DESTROY> method on global objects which are still alive.

It is also possible that this operation may take a long time to finish, causing a long delay during a restart. Sometimes this will be followed by a series of messages appearing in the server I<error_log> file, warning that certain child processes did not exit as expected. This happens when, after a few attempts to advise the child process to quit, the child is still in the middle of perl_destruct(); a lethal C<KILL> signal is then sent, aborting whatever operation the child happened to be executing and I<brutally> killing it.

If your code does not contain any C<END> blocks or C<DESTROY> methods which need to be run during child server shutdown, or if it does but skipping them is insignificant, this destruction can be avoided by setting the C<PERL_DESTRUCT_LEVEL> environment variable to C<-1>. For example add this setting to the I<httpd.conf> file:

  PerlSetEnv PERL_DESTRUCT_LEVEL -1

What constitutes a significant cleanup? Any change of state outside of the current process that would not be handled by the operating system itself. So committing database transactions and removing the lock on some resource are significant operations, but closing an ordinary file isn't.

=head1 Using apachectl to Control the Server

The Apache distribution comes with a script to control the server. It's called C<apachectl> and it is installed into the same location as the httpd executable. We will assume for the sake of our examples that it's in C</usr/local/sbin/httpd_perl/apachectl>:

To start httpd_perl:

  % /usr/local/sbin/httpd_perl/apachectl start

To stop httpd_perl:

  % /usr/local/sbin/httpd_perl/apachectl stop

To restart httpd_perl (if it is running, send C<SIGHUP>; if it is not already running just start it):

  % /usr/local/sbin/httpd_perl/apachectl restart

Do a graceful restart by sending a C<SIGUSR1>, or start if not running:

  % /usr/local/sbin/httpd_perl/apachectl graceful

To do a configuration test:

  % /usr/local/sbin/httpd_perl/apachectl configtest

Replace C<httpd_perl> with C<httpd_docs> in the above calls to control the C<httpd_docs> server.

There are other options for C<apachectl>, use the C<help> option to see them all.

It's important to remember that C<apachectl> uses the PID file, which is specified by the C<PidFile> directive in I<httpd.conf>. If you delete the PID file by hand while the server is running, C<apachectl> will be unable to stop or restart the server.

=head1 Safe Code Updates on a Live Production Server

You have prepared a new version of code, uploaded it to the production server, restarted it and it doesn't work. What could be worse than that? You also cannot go back, because you have overwritten the good working code.

It's quite easy to prevent this: just don't overwrite the previous working files!

Personally I do all updates on the live server with the following sequence. Assume that the server root directory is I</home/httpd/perl/rel>. When I'm about to update the files I create a new directory I</home/httpd/perl/beta>, copy the old files from I</home/httpd/perl/rel> and update it with the new files. Then I do some last sanity checks (check that file permissions are [read+executable], and run C<perl -c> on the new modules to make sure there are no errors in them). When I think I'm ready I do:

  % cd /home/httpd/perl
  % mv rel old && mv beta rel && stop && sleep 3 && restart && err

Let me explain what this does.

Firstly, note that I put all the commands on one line, separated by C<&&>, and only then press the C<Enter> key. As I am working remotely, this ensures that if I suddenly lose my connection (sadly this happens sometimes) I won't leave the server down if only the C<stop> command squeezed in. C<&&> also ensures that if any command fails, the rest won't be executed. I am using aliases (which I have already defined) to make the typing easier:

  % alias | grep apachectl
  graceful /usr/local/apache/bin/apachectl graceful
  rehup    /usr/local/apache/sbin/apachectl restart
  restart  /usr/local/apache/bin/apachectl restart
  start    /usr/local/apache/bin/apachectl start
  stop     /usr/local/apache/bin/apachectl stop

  % alias err
  tail -f /usr/local/apache/logs/error_log

Taking the line apart piece by piece:

  mv rel old &&

back up the working directory to I<old>

  mv beta rel &&

put the new one in its place

  stop &&

stop the server

  sleep 3 &&

give it a few seconds to shut down (it might take even longer)

  restart &&

C<restart> the server

  err

view the tail of the I<error_log> file in order to see that everything is OK

C<apachectl> generates the status messages a little too early (e.g. when you issue C<apachectl stop> it says the server has been stopped, while in fact it's still running), so don't rely on it, rely on the C<error_log> file instead.

Also notice that I use C<restart> and not just C<start>. I do this because of Apache's potentially long stopping times (it depends on what you do with it of course!). If you use C<start> and Apache hasn't yet released the port it's listening on, the start will fail and C<error_log> will tell you that the port is in use, e.g.:

  Address already in use: make_sock: could not bind to port 8080

But if you use C<restart>, it will wait for the server to quit and then will cleanly restart it.

Now what happens if the new modules are broken? First of all, I immediately see an indication of the problems reported in the C<error_log> file, which I C<tail -f> immediately after the restart command. If there's a problem, I just put everything back as it was before:

  % mv rel bad && mv old rel && stop && sleep 3 && restart && err

Usually everything will be fine, and I have had only about 10 seconds of downtime, which is pretty good!

=head1 An Intentional Disabling of Live Scripts

What happens if you really must take down the server or disable the scripts? This situation might happen when you need to do some maintenance work on your database server. If you have to take your database down then any scripts that use it will fail.

If you do nothing, the user will see either the grey C<An Error has happened> message or perhaps a customized error message if you have added code to trap and customize the errors. See L<Redirecting Errors to the Client instead of to the error_log|snippets/Redirecting_Errors_to_the_Client> for the latter case.

A much friendlier approach is to confess to your users that you are doing some maintenance work and plead for patience, promising (keep the promise!) that the service will become fully functional in X minutes. There are a few ways to do this:

The first doesn't require messing with the server. It works when you have to disable a script running under C<Apache::Registry> and relies on the fact that it checks whether the file was modified before using the cached version. Obviously it won't work under other handlers because these serve the compiled version of the code and don't check to see if there was a change in the code on the disk.

So if you want to disable an C<Apache::Registry> script, prepare a little script like this:

  /home/http/perl/maintenance.pl
  ------------------------------
  #!/usr/bin/perl -Tw

  use strict;
  use CGI;
  my $q = new CGI;
  print $q->header, $q->p(
    "Sorry, the service is temporarily down for maintenance.
     It will be back in ten to fifteen minutes.
     Please, bear with us.
     Thank you!");

So if you now have to disable a script, for example C</home/http/perl/chat.pl>, just do this:

  % mv /home/http/perl/chat.pl /home/http/perl/chat.pl.orig
  % ln -s /home/http/perl/maintenance.pl /home/http/perl/chat.pl

Of course your server configuration should allow symbolic links for this trick to work. Make sure you have the directive

  Options FollowSymLinks

in the C<E<lt>LocationE<gt>> or C<E<lt>DirectoryE<gt>> section of your I<httpd.conf>.

When you're done, it's easy to restore the previous setup. Just do this:

  % mv /home/http/perl/chat.pl.orig /home/http/perl/chat.pl

which overwrites the symbolic link.

Now make sure that the script has the current timestamp:

  % touch /home/http/perl/chat.pl

Apache will automatically detect the change and will use the moved script instead.

The second approach is to change the server configuration and configure a whole directory to be handled by a C<My::Maintenance> handler (which you must write). For example if you write something like this:

  My/Maintenance.pm
  -----------------
  package My::Maintenance;
  use strict;
  use Apache::Constants qw(:common);
  sub handler {
    my $r = shift;
    print $r->send_http_header("text/plain");
    print qq{
      We apologize, but this service is temporarily stopped for
      maintenance.  It will be back in ten to fifteen minutes.
      Please, bear with us.  Thank you!
    };
    return OK;
  }
  1;

and put it in a directory that is in the server's C<@INC>, to disable all the scripts in Location C</perl> you would replace:

  <Location /perl>
    SetHandler perl-script
    PerlHandler My::Handler
    [snip]
  </Location>

with

  <Location /perl>
    SetHandler perl-script
    PerlHandler My::Maintenance
    [snip]
  </Location>

Now restart the server. Your users will be happy to go and read http://slashdot.org for ten minutes, knowing that you are working on a much better version of the service.

If you need to disable a location handled by some module, the second approach would work just as well.

=head1 SUID Start-up Scripts

If you want to allow a few people in your team to start and stop the server you will have to give them the root password, which is not a good thing to do. The fewer people who know the password, the fewer problems are likely to be encountered. But there is an easy solution for this problem available on UNIX platforms. It's called a setuid executable.

=head2 Introduction to SUID Executables

A setuid executable has the setuid permissions bit set. This sets the process's effective user ID to that of the file's owner upon execution.

You perform this setting with the following command:

  % chmod u+s filename

You have probably used setuid executables before without even knowing about it. For example when you change your password you execute the C<passwd> utility, which among other things modifies the I</etc/passwd> file. In order to change this file you need root permissions. The C<passwd> utility has the setuid bit set, therefore when you execute this utility, its effective ID is the same as the root user's ID.

You should avoid using setuid executables as a general practice. The fewer setuid executables you have, the less likely it is that someone will find a way to break into your system by exploiting some bug you didn't know about.

When the executable is setuid to root, you have to make sure that it doesn't have the group and world read and write permissions. If we take a look at the C<passwd> utility we will see:

  % ls -l /usr/bin/passwd
  -r-s--x--x 1 root root 12244 Feb 8 00:20 /usr/bin/passwd

You achieve this with the following command:

  % chmod 4511 filename

The first digit (4) stands for the setuid bit, the second digit (5) is a compound of read (4) and executable (1) permissions for the user, and the third and fourth digits set the executable permissions for the group and the world.

=head2 Apache Startup SUID Script's Security

In our case, we want to allow setuid access only to a specific group of users, who all belong to the same group. For the sake of our example we will use the group named I<apache>. It's important that users who aren't root or who don't belong to the I<apache> group will not be able to execute this script. Therefore we perform the following commands:

  % chgrp apache apachectl
  % chmod 4510  apachectl

The execution order is important. If you swap the command execution order you will lose the setuid bit.

Now if we look at the file we see:

  % ls -l apachectl
  -r-s--x--- 1 root apache 32 May 13 21:52 apachectl

Now we are all set... Almost...

When you start Apache, Apache and Perl modules are loaded, so code can be executed. Since all this happens with the root effective ID, any code is executed as if run by the root user. You should be very careful, because while you didn't give anyone the root password, all the users in the I<apache> group have indirect root access. This means that if Apache loads some module or executes some code that is writable by any of these users, they can plant code that will allow them to gain shell access to the root account and become a real root.

Of course if you don't trust your team you shouldn't use this solution in the first place. You can try to check that all the files Apache loads aren't writable by anyone but root, but there are too many of them, especially in the mod_perl case, where many Perl modules are loaded at server startup.

By the way, don't let all this setuid stuff confuse you -- when the parent process is loaded, the children are spawned as non-root processes. This section has presented a way to allow non-root users to start the server as the root user; the rest is exactly the same as if you were executing the script as root in the first place.

=head2 Sample Apache Startup SUID Script

Now if you are still with us, here is an example of the setuid Apache startup script.

Note the line marked C<WORKAROUND>, which fixes an obscure error when starting mod_perl enabled Apache, by setting the real UID to the effective UID. Without this workaround, a mismatch between the real and the effective UID causes Perl to croak on the C<-e> switch.

Note that you must be using a version of Perl that recognizes and emulates the suid bits in order for this to work. This script will do different things depending on whether it is named C<start_httpd>, C<stop_httpd> or C<restart_httpd>. You can use symbolic links for this purpose.

  suid_apache_ctl
  ---------------
  #!/usr/bin/perl -T

  # These constants will need to be adjusted.
  $PID_FILE = '/home/www/logs/httpd.pid';
  $HTTPD = '/home/www/httpd -d /home/www';

  # These prevent taint warnings while running suid
  $ENV{PATH}='/bin:/usr/bin';
  $ENV{IFS}='';

  # This sets the real to the effective ID, and prevents
  # an obscure error when starting apache/mod_perl
  $< = $>; # WORKAROUND
  $( = $) = 0; # set the group to root too

  # Do different things depending on our name
  ($name) = $0 =~ m|([^/]+)$|;

  if ($name eq 'start_httpd') {
      system $HTTPD and die "Unable to start HTTP";
      print "HTTP started.\n";
      exit 0;
  }

  # extract the process id and confirm that it is numeric
  $pid = `cat $PID_FILE`;
  $pid =~ /(\d+)/ or die "PID $pid not numeric";
  $pid = $1;

  if ($name eq 'stop_httpd') {
      kill 'TERM',$pid or die "Unable to signal HTTP";
      print "HTTP stopped.\n";
      exit 0;
  }

  if ($name eq 'restart_httpd') {
      kill 'HUP',$pid or die "Unable to signal HTTP";
      print "HTTP restarted.\n";
      exit 0;
  }

  die "Script must be named start_httpd, stop_httpd, or restart_httpd.\n";

=head1 Preparing for Machine Reboot

When you run your own development box, it's okay to start the webserver by hand when you need to. On a production system it is possible that the machine the server is running on will have to be rebooted. When the reboot is completed, who is going to remember to start the server? It's easy to forget this task, and what happens if you aren't around when the machine is rebooted?

After the server installation is complete, it's important not to forget that you need to put a script to perform the server startup and shutdown into the standard system location, for example I</etc/rc.d> under RedHat Linux, or I</etc/init.d/apache> under Debian Slink Linux.

This is the directory which contains scripts to start and stop all the other daemons. The directory and file names vary from one Operating System (OS) to another, and even between different distributions of the same OS.

Generally the simplest solution is to copy the C<apachectl> script to your startup directory or create a symbolic link from the startup directory to the C<apachectl> script. You will find C<apachectl> in the same directory as the httpd executable after Apache installation. If you have more than one Apache server you will need a separate script for each one, and of course you will have to rename them so that they can co-exist in the same directories.

For example on a RedHat Linux machine with two servers, I have the following setup:

  /etc/rc.d/init.d/httpd_docs
  /etc/rc.d/init.d/httpd_perl
  /etc/rc.d/rc3.d/S91httpd_docs -> ../init.d/httpd_docs
  /etc/rc.d/rc3.d/S91httpd_perl -> ../init.d/httpd_perl
  /etc/rc.d/rc6.d/K16httpd_docs -> ../init.d/httpd_docs
  /etc/rc.d/rc6.d/K16httpd_perl -> ../init.d/httpd_perl

The scripts themselves reside in the I</etc/rc.d/init.d> directory. There are symbolic links to these scripts in other directories. The names are the same as the script names but they have numerical prefixes, which are used for executing the scripts in a particular order: the lower numbers are executed earlier.

When the system starts (level 3) we want Apache to be started when almost all of the services are running already, therefore I've used I<S91>.

For example if the mod_perl enabled Apache issues a C<connect_on_init()>, the SQL server should be started before Apache.

When the system shuts down (level 6), Apache should be stopped as one of the first processes, therefore I've used I<K16>. Again, if the server does some cleanup processing during the shutdown event and requires third party services to be running (e.g. an SQL server), it should be stopped before these services.

Notice that it's normal for more than one symbolic link to have the same sequence number.

Under RedHat Linux and similar systems, when a machine is booted and its runlevel is set to 3 (multiuser + network), Linux goes into I</etc/rc.d/rc3.d/> and executes the scripts the symbolic links point to with the C<start> argument. When it sees I<S91httpd_perl>, it executes:

  /etc/rc.d/init.d/httpd_perl start

When the machine is shut down, the scripts are executed through links from the I</etc/rc.d/rc6.d/> directory. This time the scripts are called with the C<stop> argument, like this:

  /etc/rc.d/init.d/httpd_perl stop

Most systems have GUI utilities to automate the creation of symbolic links. For example RedHat Linux includes the C<control-panel> utility, which amongst other things includes the C<RunLevel Manager> (which can be invoked directly as either ntsysv(8) or tksysv(8)). This will help you to create the proper symbolic links. Of course before you use it, you should put C<apachectl> or a similar script into the I<init.d> or equivalent directory. Or you can have a symbolic link to some other location instead.

The simplest approach is to use the chkconfig(8) utility, which adds and removes the services for you. The following example shows how to add an I<httpd_perl> startup script to the system.

First move or copy the file into the directory I</etc/rc.d/init.d>:

  % mv httpd_perl /etc/rc.d/init.d

Now open the script in your favorite editor and add the following lines after the main header of the script:

  # Comments to support chkconfig on RedHat Linux
  # chkconfig: 2345 91 16
  # description: mod_perl enabled Apache Server

So now the beginning of the script looks like:

  #!/bin/sh
  #
  # Apache control script designed to allow an easy command line
  # interface to controlling Apache.  Written by Marc Slemko,
  # 1997/08/23

  # Comments to support chkconfig on RedHat Linux
  # chkconfig: 2345 91 16
  # description: mod_perl enabled Apache Server
  #
  # The exit codes returned are:
  # ...

Adjust the line:

  # chkconfig: 2345 91 16

to your needs. The above setting says that the script should be started in levels 2, 3, 4, and 5, that its start priority should be 91, and that its stop priority should be 16.

Now all you have to do is to ask C<chkconfig> to configure the startup scripts. Before we do that let's look at what we have:

  % find /etc/rc.d | grep httpd_perl

  /etc/rc.d/init.d/httpd_perl

Which means that we only have the startup script itself.

Now we execute:

  % chkconfig --add httpd_perl

and see what has changed:

  % find /etc/rc.d | grep httpd_perl

  /etc/rc.d/init.d/httpd_perl
  /etc/rc.d/rc0.d/K16httpd_perl
  /etc/rc.d/rc1.d/K16httpd_perl
  /etc/rc.d/rc2.d/S91httpd_perl
  /etc/rc.d/rc3.d/S91httpd_perl
  /etc/rc.d/rc4.d/S91httpd_perl
  /etc/rc.d/rc5.d/S91httpd_perl
  /etc/rc.d/rc6.d/K16httpd_perl

As you can see, C<chkconfig> created all the symbolic links for us,
using the startup and shutdown priorities as specified in the line:

  # chkconfig: 2345 91 16

If for some reason you want to remove the service from the startup
scripts, all you have to do is to tell C<chkconfig> to remove the
links:

  % chkconfig --del httpd_perl

Now if we look at the files under the directory I</etc/rc.d/> we see
again only the script itself:

  % find /etc/rc.d | grep httpd_perl

  /etc/rc.d/init.d/httpd_perl

Of course you may keep the startup script in any other directory as
long as you can link to it. For example if you want to keep this file
with all the Apache binaries in I</usr/local/apache/bin>, all you have
to do is to provide a symbolic link to this file:

  % ln -s /usr/local/apache/bin/apachectl /etc/rc.d/init.d/httpd_perl

and then:

  % chkconfig --add httpd_perl

Note that when symlinks are used, it's the link name in
I</etc/rc.d/init.d> that matters, not the name of the script the link
points to.

=head1 Monitoring the Server. A watchdog.

With mod_perl many things can happen to your server. It is possible
that the server might die when you are not around. As with any other
critical service, you need to run some kind of watchdog.

One simple solution is to use a slightly modified C<apachectl> script,
which I've named I<apache.watchdog>. Call it from the crontab every 30
minutes -- or even every minute -- to make sure the server is up all
the time.

The crontab entry for 30 minute intervals:

  0,30 * * * * /path/to/the/apache.watchdog >/dev/null 2>&1

The script:

  #!/bin/sh

  # this script is a watchdog to see whether the server is online
  # It tries to restart the server, and if it's
  # down it sends an email alert to the admin

  # admin's email
  EMAIL=[EMAIL PROTECTED]

  # the path to your PID file
  PIDFILE=/usr/local/var/httpd_perl/run/httpd.pid

  # the path to your httpd binary, including options if necessary
  HTTPD=/usr/local/sbin/httpd_perl/httpd_perl

  # check for pidfile
  if [ -f $PIDFILE ] ; then
    PID=`cat $PIDFILE`

    if kill -0 $PID; then
      STATUS="httpd (pid $PID) running"
      RUNNING=1
    else
      STATUS="httpd (pid $PID?) not running"
      RUNNING=0
    fi
  else
    STATUS="httpd (no pid file) not running"
    RUNNING=0
  fi

  if [ $RUNNING -eq 0 ]; then
    echo "$0: httpd not running, trying to start"
    if $HTTPD ; then
      echo "$0: httpd started"
      mail -s "$0: httpd started" $EMAIL < /dev/null > /dev/null 2>&1
    else
      echo "$0: httpd could not be started"
      mail -s "$0: httpd could not be started" $EMAIL \
        < /dev/null > /dev/null 2>&1
    fi
  fi

Another approach, probably even more practical, is to use the cool
C<LWP> Perl package to test the server by trying to fetch some
document (script) served by the server. Why is it more practical?
Because while the server can be up as a process, it can be stuck and
not working. Failing to fetch the document will trigger a restart, and
the problem will "probably" go away.

As before, we set up a crontab entry to call this script every few
minutes to fetch some very light script. The best thing of course is
to call it every minute. Why so often?
If your server starts to spin and fill the I<error_log> with multiple
error messages, in five minutes you might run out of free disk space,
which might bring your system to its knees. Chances are that no other
child will be able to serve requests, since the system will be too
busy writing to the I<error_log> file. Think big -- if you are running
a heavy service (which is very fast, since you are running under
mod_perl) adding one more request every minute will not be felt by the
server at all.

So we end up with a crontab entry like this:

  * * * * * /path/to/the/watchdog.pl >/dev/null 2>&1

And the watchdog itself:

  #!/usr/bin/perl -wT

  # untaint
  $ENV{'PATH'} = '/bin:/usr/bin';
  delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};

  use strict;
  use diagnostics;

  my $VERSION = '0.01';
  use vars qw($ua $proxy);
  $proxy = '';

  require LWP::UserAgent;
  use HTTP::Status;

  ###### Config ########
  my $test_script_url = 'http://www.example.com:81/perl/test.pl';
  my $monitor_email   = '[EMAIL PROTECTED]';
  my $restart_command = '/usr/local/sbin/httpd_perl/apachectl restart';
  my $mail_program    = '/usr/lib/sendmail -t -n';
  ######################

  $ua = new LWP::UserAgent;
  $ua->agent("$0/watchdog " . $ua->agent);

  # Uncomment the proxy if you access a machine from behind a firewall
  # $proxy = "http://www-proxy.com";
  $ua->proxy('http', $proxy) if $proxy;

  # If it returns '1' it means we are alive
  exit 1 if checkurl($test_script_url);

  # Houston, we have a problem.
  # The server seems to be down, try to restart it.
  my $status = system $restart_command;

  my $message = ($status == 0)
              ? "Server was down and successfully restarted!"
              : "Server is down. Can't restart.";

  my $subject = ($status == 0)
              ? "Attention! Webserver restarted"
              : "Attention! Webserver is down. Can't restart";

  # email the monitoring person
  my $to   = $monitor_email;
  my $from = $monitor_email;
  send_mail($from, $to, $subject, $message);

  # input: URL to check
  # output: 1 for success, 0 for failure
  #######################
  sub checkurl {
      my ($url) = @_;

      # Fetch document
      my $res = $ua->request(HTTP::Request->new(GET => $url));

      # Check the result status
      return 1 if is_success($res->code);

      # failed
      return 0;
  } # end of sub checkurl

  # send email about the problem
  #######################
  sub send_mail {
      my ($from, $to, $subject, $messagebody) = @_;

      open MAIL, "|$mail_program"
          or die "Can't open a pipe to $mail_program: $!\n";

      print MAIL <<__END_OF_MAIL__;
  To: $to
  From: $from
  Subject: $subject

  $messagebody
  __END_OF_MAIL__

      close MAIL;
  }

=head1 Running a Server in Single Process Mode

Often while developing new code, you will want to run the server in
single process mode. See L<Sometimes it works Sometimes it does
Not|porting/Sometimes_it_Works_Sometimes_it> and L<Names collisions
with Modules and libs|porting/Name_collisions_with_Modules_and>.
Running in single process mode inhibits the server from
"daemonizing", and this allows you to run it under the control of a
debugger more easily.

  % /usr/local/sbin/httpd_perl/httpd_perl -X

When you use the C<-X> switch the server will run in the foreground of
the shell, so you can kill it with I<Ctrl-C>. Note that in C<-X>
(single-process) mode the server will run very slowly when fetching
images.

Note for Netscape users: If you use Netscape while your server is
running in single-process mode, HTTP's C<KeepAlive> feature gets in
the way. Netscape tries to open multiple connections and keep them
open.
Because there is only one server process listening, each connection
has to time out before the next one succeeds. Turn off C<KeepAlive> in
I<httpd.conf> to avoid this effect while developing. If you use the
image size parameters, Netscape will be able to render the page
without the images, so you can press the browser's I<STOP> button
after a few seconds.

In addition you should know that when running with C<-X> you will not
see the control messages that the parent server normally writes to the
I<error_log> (I<"server started">, I<"server stopped"> etc.). Since
C<httpd -X> causes the server to handle all requests itself, without
forking any children, there is no controlling parent to write the
status messages.

=head1 Starting a Personal Server for Each Developer

If you are the only developer working on the specific server:port, you
have no problems, since you have complete control over the
server. However, often you will have a group of developers who need to
develop mod_perl scripts and modules concurrently. This means that
each developer will want to have control over the server -- to kill
it, to run it in single server mode, to restart it, etc., as well as
having control over the location of the log files, configuration
settings like C<MaxClients>, and so on.

You I<can> work around this problem by preparing a few I<httpd.conf>
files and forcing each developer to use:

  httpd_perl -f /path/to/httpd.conf

but I approach it in a different way. I use the C<-Dparameter> startup
option of the server. I call my version of the server with:

  % httpd_perl -Dstas

In I<httpd.conf> I write:

  # Personal development Server for stas
  # stas uses the server running on port 8000
  <IfDefine stas>
    Port 8000
    PidFile /usr/local/var/httpd_perl/run/httpd.pid.stas
    ErrorLog /usr/local/var/httpd_perl/logs/error_log.stas
    Timeout 300
    KeepAlive On
    MinSpareServers 2
    MaxSpareServers 2
    StartServers 1
    MaxClients 3
    MaxRequestsPerChild 15
  </IfDefine>

  # Personal development Server for userfoo
  # userfoo uses the server running on port 8001
  <IfDefine userfoo>
    Port 8001
    PidFile /usr/local/var/httpd_perl/run/httpd.pid.userfoo
    ErrorLog /usr/local/var/httpd_perl/logs/error_log.userfoo
    Timeout 300
    KeepAlive Off
    MinSpareServers 1
    MaxSpareServers 2
    StartServers 1
    MaxClients 5
    MaxRequestsPerChild 0
  </IfDefine>

With this technique we have achieved full control over start/stop,
number of children, a separate error log file, and port selection for
each server. This saves Stas from getting called every few minutes by
Eric: "Stas, I'm going to restart the server".

With the above technique, you need to discover the PID of your parent
C<httpd_perl> process, which is written in
C</usr/local/var/httpd_perl/run/httpd.pid.stas> (and similarly for the
user I<eric>). To make things even easier we change the I<apachectl>
script to do the work for us. We make a copy for each developer,
called B<apachectl.username>, and we change two lines in each script:

  PIDFILE=/usr/local/var/httpd_perl/run/httpd.pid.username
  HTTPD='/usr/local/sbin/httpd_perl/httpd_perl -Dusername'

So for the user I<stas> we prepare a startup script called
I<apachectl.stas>, changing these two lines in the standard apachectl
script as it comes unmodified from the Apache distribution:

  PIDFILE=/usr/local/var/httpd_perl/run/httpd.pid.stas
  HTTPD='/usr/local/sbin/httpd_perl/httpd_perl -Dstas'

So now when user I<stas> wants to stop the server he will execute:

  apachectl.stas stop

And to start:

  apachectl.stas start

Certainly the rest of the C<apachectl> arguments apply as before.

You might think about having only one C<apachectl> script and
detecting who is calling it by checking the UID. But since you have to
be root to start the server, this is not possible unless you set the
setuid bit on the script, as we've explained at the beginning of this
chapter. If you do so, you can have a single C<apachectl> script for
all developers, modified to find out the UID of the user who executes
it and to set the right paths accordingly, as the sketch below
illustrates.
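This is only a minimal, untested sketch of the idea, assuming the
per-user naming convention used above (I<httpd.pid.username> and the
C<-Dusername> parameter); a real script would need at least the same
care with taint mode and signals as the C<suid_apache_ctl> example
earlier in this chapter:

  #!/usr/bin/perl -T

  # shared suid control script: pick the per-developer paths
  # by looking at the real UID of the invoking user
  $ENV{PATH} = '/bin:/usr/bin';
  $ENV{IFS}  = '';

  my $user    = getpwuid($<) or die "Can't resolve UID $<";
  my $pidfile = "/usr/local/var/httpd_perl/run/httpd.pid.$user";
  my $httpd   = "/usr/local/sbin/httpd_perl/httpd_perl -D$user";

  my $action  = shift || '';
  if ($action eq 'start') {
      system $httpd and die "Unable to start httpd";
  }
  elsif ($action eq 'stop') {
      # extract the pid and untaint it at the same time
      open PID, $pidfile or die "Can't open $pidfile: $!";
      my ($pid) = <PID> =~ /(\d+)/ or die "No PID in $pidfile";
      close PID;
      kill 'TERM', $pid or die "Unable to stop httpd";
  }
  else {
      die "Usage: $0 [start|stop]\n";
  }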
The last thing is to provide developers with an option to run the
server in single process mode:

  /usr/local/sbin/httpd_perl/httpd_perl -Dstas -X

In addition to making life easier, we decided to use relative links
everywhere in the static documents, including the calls to CGIs. You
may ask how using relative links will get the request to the right
server port. It's very simple: we use C<mod_rewrite>.

To use mod_rewrite you have to configure your I<httpd_docs> server
with C<--enable-module=rewrite> and recompile, or use DSO and load the
module in I<httpd.conf>. In the I<httpd.conf> of our C<httpd_docs>
server we have the following code:

  RewriteEngine on

  # stas's server
  # port = 8000
  RewriteCond  %{REQUEST_URI} ^/(perl|cgi-perl)
  RewriteCond  %{REMOTE_ADDR} 123.34.45.56
  RewriteRule ^(.*)           http://example.com:8000/$1 [P,L]

  # eric's server
  # port = 8001
  RewriteCond  %{REQUEST_URI} ^/(perl|cgi-perl)
  RewriteCond  %{REMOTE_ADDR} 123.34.45.57
  RewriteRule ^(.*)           http://example.com:8001/$1 [P,L]

  # all the rest
  RewriteCond  %{REQUEST_URI} ^/(perl|cgi-perl)
  RewriteRule ^(.*)           http://example.com:81/$1 [P]

The IP addresses are the addresses of the developers' desktop machines
(where they are running their web browsers). So if an HTML file
includes a relative URI like I</perl/test.pl> or even
I<http://www.example.com/perl/test.pl>, clicking on the link will be
internally proxied to I<http://www.example.com:8000/perl/test.pl> if
the click has been made at user I<stas>'s desktop machine, or to
I<http://www.example.com:8001/perl/test.pl> for a request generated
from user I<eric>'s machine, per our above URI rewrite example.

Another possibility is to use the C<REMOTE_USER> variable, if all the
developers are forced to authenticate themselves before they can
access the server. If you do, you will have to change the
C<RewriteRule>s to match C<REMOTE_USER> in the above example.

We wish to stress again that the above setup will work only with
relative URIs in the HTML code. If you choose to generate full URIs
including a non-80 port, the requests originating from this HTML code
will bypass the light server listening on the default port 80, and go
directly to the I<server:port> of the full URI.

=head1 Wrapper to Emulate the Server Perl Environment

Often you will start off debugging your script by running it from your
favorite shell program. Sometimes you encounter a very weird situation
when the script runs from the shell but dies when processed as a CGI
script by a web server. The real problem often lies in the difference
between the environment variables used by your web server and the ones
used by your shell program.

For example you may have a set of non-standard Perl directories, used
for local Perl modules. You have to tell the Perl interpreter where
these directories are. If you don't want to modify C<@INC> in all
scripts and modules, you can use the C<PERL5LIB> environment variable
to tell Perl where the directories are. But then you might forget to
alter the mod_perl startup script to correct C<@INC> there as well.
And if you forget this, you can be quite puzzled why the scripts are
running from the shell program, but not from the web. Of course the
I<error_log> will help as well to find out what the problem is, but
there can be other obscure cases, where you do something different at
the shell program and your scripts refuse to run under the web server.

Another example is when you have more than one version of Perl
installed. You might call one version of the Perl executable on the
first line of the script (the shebang line), but have the web server
compiled with another Perl version. Since mod_perl ignores the path to
the Perl executable on the first line of the script, you can get quite
confused when the code doesn't behave the same way when processed as a
request as it does when executed from the command line. It can take a
while before you realize that you are testing the scripts from the
shell program using the I<wrong> Perl version.

The best debugging approach is to write a wrapper that emulates the
exact environment of the server, first deleting environment variables
like C<PERL5LIB> and then calling the same perl binary that is being
used by the server. Next, set the environment identical to the
server's by copying the Perl run directives from the server startup
and configuration files, or even by require()'ing the startup file if
it doesn't include any C<Apache::> modules' code, which is unavailable
under the shell. This will also allow you to remove the first line of
the script completely, since mod_perl doesn't need it anyway and the
wrapper knows how to call the script.

Here is an example of such a script. Note that we force the use of
C<-Tw> when we call the real script: when debugging we want to make
sure that the code works when taint mode is on, and we want to see all
the warnings, to help Perl help us write better code. We have also
added the ability to pass parameters, which will not happen when you
issue a request to the script, but which can be helpful at times.

  #!/usr/bin/perl -w

  # This is a wrapper example

  # It simulates the web server environment by setting @INC and other
  # stuff, so what will run under this wrapper will run under Web and
  # vice versa.
  #
  # Usage: wrap.pl some_cgi.pl
  #
  BEGIN {
    # we want to make a complete emulation, so we must reset all the
    # paths and add the standard Perl libs
    @INC =
      qw(/usr/lib/perl5/5.00503/i386-linux
         /usr/lib/perl5/5.00503
         /usr/lib/perl5/site_perl/5.005/i386-linux
         /usr/lib/perl5/site_perl/5.005
         .
        );
  }

  use strict;
  use File::Basename;

  # process the passed params
  my $cgi    = shift || '';
  my $params = (@ARGV) ? join(" ", @ARGV) : '';

  die "Usage:\n\t$0 some_cgi.pl\n" unless $cgi;

  # Set the environment
  my $PERL5LIB = join ":", @INC;

  # if the path includes the directory
  # we extract it and chdir there
  if (index($cgi,'/') >= 0) {
    my $dirname = dirname($cgi);
    chdir $dirname or die "Can't chdir to $dirname: $! \n";
    $cgi = basename($cgi);
  }

  # run the cgi from the script's directory
  # Note that we set Warning and Taint modes ON!!!
  system qq{/usr/bin/perl -I$PERL5LIB -Tw $cgi $params};

=head1 Server Maintenance Chores

It's not enough to have your server and service up and running. You
have to maintain the server even when everything seems to be
fine. This includes security auditing, keeping an eye on the amount of
remaining unused disk space, available RAM, the load on the system,
etc.
If you forget about these chores, one day (sooner or later) your
system will crash, either because it has run out of free disk space,
because all the available CPU has been used and the system has started
to swap heavily, or because someone has broken in. Unfortunately the
last of these is beyond the scope of this guide, since covering it
properly would take more than one book, but the rest are quite easy to
prevent if you follow our advice.

Certainly, your particular system might have maintenance chores that
aren't covered here, but at least you will be alerted that these
chores are real and should be taken care of.

=head2 Handling Log Files

There are two issues to solve with log files. First, they should be
rotated and compressed on a constant basis, since they tend to consume
big parts of the disk space over time. Second, they should be
monitored for sudden explosive growth: when something goes astray in
the code running on the mod_perl server, the process may start logging
thousands of error messages per second without stopping, until all the
disk space is used and the server cannot work anymore.

=head3 Log Rotation

The first issue is solved by having a process run by crontab at
certain times (usually off hours, if this term is still valid in the
Internet era) to rotate the logs. The log rotation consists of
renaming the current log file, restarting the server (which creates a
fresh new log file), and compressing the renamed file and/or moving it
to a different disk.

For example, if we want to rotate the I<access_log> file we could do:

  % mv access_log access_log.renamed
  % apachectl restart
  % sleep 5; # allow all children to complete requests and logging
             # now it's safe to use access_log.renamed
  % mv access_log.renamed /some/directory/on/another/disk

This is the script that we run from the crontab to rotate the log
files:

  #!/usr/local/bin/perl -Tw

  # This script does log rotation. Called from crontab.

  use strict;
  $ENV{PATH}='/bin:/usr/bin';

  ### configuration
  my @logfiles = qw(access_log error_log);
  umask 0;
  my $server = "httpd_perl";
  my $logs_dir = "/usr/local/var/$server/logs";
  my $restart_command = "/usr/local/sbin/$server/apachectl restart";
  my $gzip_exec = "/usr/bin/gzip";

  my ($sec,$min,$hour,$mday,$mon,$year) = localtime(time);
  my $time = sprintf "%0.4d.%0.2d.%0.2d-%0.2d.%0.2d.%0.2d",
                     $year+1900, ++$mon, $mday, $hour, $min, $sec;
  $^I = ".$time";

  # rename log files
  chdir $logs_dir;
  @ARGV = @logfiles;
  while (<>) {
    close ARGV;
  }

  # now restart the server so the logs will be restarted
  system $restart_command;

  # allow all children to complete requests and logging
  sleep 5;

  # compress log files
  foreach (@logfiles) {
      system "$gzip_exec $_.$time";
  }

Note: Setting C<$^I> sets the in-place edit flag to a dot followed by
the time. We copy the names of the logfiles into C<@ARGV>, then open
each in turn and immediately close it, without making any changes; but
because the in-place edit flag is set, each file is effectively
renamed.

As you can see, the rotated files will include the date and the time
in their filenames.

Here is a more generic set of scripts for log rotation.
A cron job fires off a setuid script called log-roller that looks like
this:

  #!/usr/bin/perl -Tw

  use strict;
  use File::Basename;

  $ENV{PATH} = "/usr/ucb:/bin:/usr/bin";

  my $ROOT = "/WWW/apache"; # names are relative to this
  my $CONF = "$ROOT/conf/httpd.conf"; # master conf
  my $MIDNIGHT = "MIDNIGHT";  # name of program in each logdir

  my ($user_id, $group_id, $pidfile); # will be set during parse of conf
  die "not running as root" if $>;

  chdir $ROOT or die "Cannot chdir $ROOT: $!";

  my %midnights;
  open CONF, "<$CONF" or die "Cannot open $CONF: $!";
  while (<CONF>) {
    if (/^User (\w+)/i) {
      $user_id = getpwnam($1);
      next;
    }
    if (/^Group (\w+)/i) {
      $group_id = getgrnam($1);
      next;
    }
    if (/^PidFile (.*)/i) {
      $pidfile = $1;
      next;
    }
    next unless /^ErrorLog (.*)/i;
    my $midnight = (dirname $1)."/$MIDNIGHT";
    next unless -x $midnight;
    $midnights{$midnight}++;
  }
  close CONF;

  die "missing User definition" unless defined $user_id;
  die "missing Group definition" unless defined $group_id;
  die "missing PidFile definition" unless defined $pidfile;

  open PID, $pidfile or die "Cannot open $pidfile: $!";
  <PID> =~ /(\d+)/;
  my $httpd_pid = $1;
  close PID;
  die "missing pid definition" unless defined $httpd_pid and $httpd_pid;

  kill 0, $httpd_pid or die "cannot find pid $httpd_pid: $!";

  for (sort keys %midnights) {
    defined(my $pid = fork) or die "cannot fork: $!";
    if ($pid) {
      ## parent:
      waitpid $pid, 0;
    } else {
      my $dir = dirname $_;
      ($(,$)) = ($group_id,$group_id);
      ($<,$>) = ($user_id,$user_id);
      chdir $dir or die "cannot chdir $dir: $!";
      exec "./$MIDNIGHT";
      die "cannot exec $MIDNIGHT: $!";
    }
  }

  kill 1, $httpd_pid or die "Cannot SIGHUP $httpd_pid: $!";

And then the individual C<MIDNIGHT> scripts can look like this:

  #!/usr/bin/perl -Tw

  use strict;

  die "bad guy" unless getpwuid($<) =~ /^(root|nobody)$/;
  my @LOGFILES = qw(access_log error_log);
  umask 0;
  $^I = ".".time;
  @ARGV = @LOGFILES;
  while (<>) {
    close ARGV;
  }

Can you spot the security holes? Take your time... This code shouldn't
be used in hostile situations.

=head3 Non-Scheduled Emergency Log Rotation

As we have mentioned before, there are times when the web server goes
wild and starts to log lots of messages to the I<error_log> file
non-stop. If no one monitors this, it is possible that in a few
minutes all the free disk space will be filled and no process will be
able to work normally. When this happens, the I/O the faulty server
causes is so heavy that its sibling processes cannot serve requests.

Generally this is not the case, but a few people have reported
encountering this problem. If you are one of them, you should run a
monitoring program that checks the log file size and, if it notices
that the file has grown too large, attempts to restart the server and
possibly trim the log file.

When we were using a quite old mod_perl version, we sometimes had
bursts of the error I<Callback called exit> showing up in our
I<error_log>. The file could grow to 300 Mbytes in a few minutes.

Below is an example of a script that should be executed from the
crontab to handle situations like this. The cron job should run every
few minutes, or even every minute, since if you experience this
problem you know that log files fill up very fast. The example script
will rotate when the I<error_log> grows over 100K. Note that this
script is useful when you have the normal scheduled log rotation
facility working; remember that this one is an emergency solution, not
to be used for routine log rotation.
  emergency_rotate.sh
  -------------------
  #!/bin/sh
  S=`ls -s /usr/local/apache/logs/error_log | awk '{print $1}'`
  if [ "$S" -gt 100000 ] ; then
    mv /usr/local/apache/logs/error_log /usr/local/apache/logs/error_log.old
    /etc/rc.d/init.d/httpd restart
    date | /bin/mail -s "error_log $S kB on inx" [EMAIL PROTECTED]
  fi

Of course you could write a more advanced script, using timestamps and
other bells and whistles. This example merely illustrates how to solve
the problem in question.

Another solution is to use out-of-the-box tools that were written for
this purpose. The C<daemontools> package
(ftp://koobera.math.uic.edu/www/daemontools.html) includes a utility
called C<multilog>. This utility saves the stdin stream to one or more
log files. It optionally timestamps each line and, for each log,
includes or excludes lines matching specified patterns. It
automatically rotates logs to limit the amount of disk space used. If
the disk fills up, it pauses and tries again, without losing any data.

The obvious caveat is that it doesn't restart the server, so while it
tries to solve the log file handling problem it doesn't handle the
originator of the problem. But since the I/O of the log-writing Apache
process will be quite heavy, the rest of the servers will work very
slowly if at all, and a normal watchdog should detect this abnormal
situation and restart the Apache server.

=head1 Swapping Prevention

Before delving into the details of the swapping process, let's refresh
our knowledge of memory components and memory management.

Computer memory is called RAM, which stands for Random Access
Memory. Reading from and writing to RAM is faster, by a few orders of
magnitude, than performing the same operations on a hard disk: the
former uses non-movable memory cells, while the latter uses rotating
magnetic media.

On most operating systems swap memory is used as an extension of RAM
and not as a duplication of it. So if your OS is one of those, and you
have 128MB of RAM and a 256MB swap partition, you have a total of
384MB of memory available. You should never count on the extra memory
when you decide on the maximum number of processes to be run, and I
will show why in a moment.

The swap memory can be built from a number of hard disk partitions and
swap files formatted to be used as swap memory. When you need more
swap memory you can always extend it on demand as long as you have
some free disk space (for more information see the I<mkswap> and
I<swapon> manpages).

System memory is quantified in units called memory pages. Usually the
size of a memory page is between 1KB and 8KB. So if you have 256MB of
RAM installed on your machine and the page size is 4KB, your system
has 65,536 (256MB/4KB) main memory pages to work with, and these pages
are fast. If you also have a 256MB swap partition, the system can use
yet another 65,536 memory pages, but they are much slower.

When the system is started, all memory pages are available for use by
the programs (processes). Unless the program is really small, the
process running it uses only a few segments of the program, each
segment mapped onto its own memory page. Therefore only a few memory
pages need to be loaded into memory at first.

When the process needs an additional program segment to be loaded into
memory, it asks the system whether the page containing this segment is
already loaded in memory.
If the page is not found, an event known as a I<page fault> occurs.
This requires the system to allocate a free memory page, then go to
the disk, and read and load the requested program segment into the
allocated memory page.

If a process needs to bring a new page into physical memory and there
are no free physical pages available, the operating system must make
room for this page by discarding another page from physical memory.

If the page to be discarded from physical memory came from an image or
data file and has not been written to, then the page does not need to
be saved. Instead it can be discarded, and if the process needs that
page again it can be brought back into memory from the image or data
file.

However, if the page has been modified, the operating system must
preserve the contents of that page so that it can be accessed at a
later time. This type of page is known as a I<dirty page>, and when it
is removed from memory it is saved in a special sort of file called
the swap file. This process is referred to as I<swapping out>.

Accesses to the swap file are very slow relative to the speed of the
processor and physical memory, and the operating system must juggle
the need to write pages to disk with the need to retain them in memory
to be used again.

To improve the swapping-out process, i.e. to decrease the possibility
that a page which has just been swapped out will be needed again the
next moment, the LRU (least recently used) or a similar algorithm is
used.

To summarize the two swapping scenarios: discarding read-only pages
incurs no overhead, in contrast with discarding data pages that have
been written to, since in the latter case the pages have to be written
to a swap partition located on the slow disk. Therefore your machine's
overall performance will be much better if fewer memory pages can
become dirty.

The problem is that Perl is a language with no strong data typing,
which means that both the program code and the program data are seen
as data pages by the OS, since both are mapped to the same memory
pages. Therefore a big chunk of your Perl code becomes dirty when its
variables are modified, and when those pages need to be discarded they
have to be written to the swap partition.

This leads us to two important conclusions about swapping and Perl.

=over

=item *

Running your system when there is no free main memory available
hinders performance, because process memory pages will be discarded
and then reread from disk again and again.

=item *

Since a majority of the running code is Perl code, in addition to the
overhead of reading the previously discarded pages back in, there is
the overhead of saving the dirty pages to the swap partition.

=back

When the system has to swap memory pages in and out, it slows down,
not serving the processes as fast as before. This leads to an
accumulation of processes waiting for their turn to run, which further
increases processing demands, which in turn slows down the system even
more as more memory is required. This ever-worsening spiral will bring
the machine to a halt, unless the resource demand suddenly drops and
allows the processes to catch up with their tasks and go back to
normal memory usage.

In addition it's important to know that, for better performance, most
programs, particularly programs written in Perl, don't return memory
pages to the operating system while they are running, on most modern
OSs. If some of the memory gets freed it is reused when needed by the
process itself, without the additional overhead of asking the system
to allocate new memory pages. That's why you will observe that Perl
programs grow in size as they run and almost never shrink. When the
process quits, it returns all its memory pages to the pool of freely
available pages for other processes to use. You can demonstrate this
behaviour with the short sketch below.
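This is a minimal demonstration (assuming the C<GTop> module, a Perl
interface to the libgtop library, is installed; the exact numbers you
will see depend on your platform and Perl build):

  use strict;
  use GTop ();

  my $gtop = GTop->new;
  print "before: ", $gtop->proc_mem($$)->size, " bytes\n";

  {
      # allocate roughly 10MB, which goes out of scope immediately
      my @big = ('x' x 1000) x 10_000;
  }

  # the freed memory is kept for reuse by this process,
  # so the process size does not shrink
  print "after : ", $gtop->proc_mem($$)->size, " bytes\n";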
This scenario is certainly educational, and it should now be obvious
that the system running your web server should never swap. It's
absolutely normal for your desktop machine to start swapping. You will
see it immediately, since things will slow down and sometimes the
system will freeze for short periods. But as I've just mentioned, on
the desktop you can stop starting new programs and quit some running
ones, thus allowing the system to catch up with the load and come back
to using just the RAM.

In the case of the web server you have much less control, since it's
your users who load the machine by issuing requests to your
server. Therefore you should configure the server so that the maximum
number of possible processes is small enough, using the C<MaxClients>
directive (for the technique for choosing the right C<MaxClients>
refer to the section 'L<Choosing
MaxClients|performance/Choosing_MaxClients>'). This will ensure that
at peak hours the system won't swap. Remember that swap space is an
emergency pool, not a resource to be used routinely. If you are low on
memory and you badly need it, buy it, or reduce the number of
processes to prevent swapping.

However sometimes, due to faulty code, a process might start spinning
in an unconstrained loop, consuming all the available RAM and starting
to use swap memory heavily. In such a situation it helps if you have a
big emergency pool (i.e. lots of swap memory), but you have to resolve
the problem as soon as possible since this pool won't last for long.
For swapping monitoring techniques see the section 'L<Apache::VMonitor
-- Visual System and Apache Server
Monitor|debug/Apache_VMonitor_Visual_Syste>'. In the meanwhile, the
C<Apache::Resource> module can be handy for putting an upper limit on
the resources a runaway child process may consume, as the
configuration sketch below shows.
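This is only a sketch; the limit values here are arbitrary, and you
should consult the C<Apache::Resource> documentation for the
C<PERL_RLIMIT_*> variables supported on your platform. In
I<httpd.conf>:

  PerlModule Apache::Resource
  # set the soft and hard data segment limits (in MB), so that a
  # runaway child is killed before it can consume all the RAM
  PerlSetEnv PERL_RLIMIT_DATA 32:48
  PerlChildInitHandler Apache::Resource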
=head1 Preventing mod_perl Processes From Going Wild

Sometimes people report that they had a problem with their code
running under mod_perl that has caused all the RAM or all the disk to
be used. The following tips should help you prevent these problems,
before they hit you, if they ever do.

=head2 All RAM Consumed

Sometimes calling an undefined subroutine in a module can cause a
tight loop that consumes all the available memory. Here is a way to
catch such errors. Define an C<UNIVERSAL::AUTOLOAD> subroutine in your
I<startup.pl>, or in a C<E<lt>PerlE<gt>E<lt>/PerlE<gt>> section in
your I<httpd.conf> file:

  sub UNIVERSAL::AUTOLOAD {
      my $class = shift;
      warn "$class can't \$UNIVERSAL::AUTOLOAD=$UNIVERSAL::AUTOLOAD!\n";
  }

I do the latter. Putting it in all your mod_perl modules would be
redundant (and might give you compile-time errors).

This will produce a nice error in I<error_log>, giving the line number
of the call and the name of the undefined subroutine.

=cut

  1.1      modperl-docs/src/docs/1.0/guide/correct_headers.pod

  Index: correct_headers.pod
  ===================================================================

=head1 NAME

Issuing Correct HTTP Headers

=head1 SYNOPSIS

As there is always more than one way to do it, I'm tempted to believe
one must be the best. Hardly ever am I right.

=head1 The Origin of this Chapter

This chapter has been contributed to the Guide by Andreas Koenig. You
will find the references and other related info at the bottom of this
page. I'll try to keep it up to date with the master version which
resides on CPAN. If in doubt -- always check CPAN for
C<Apache::correct_headers>.

If you have any questions regarding this specific document only,
please refer to Andreas, since he is the guru on this subject. On any
other matter please contact the mod_perl mailing list.

=head1 DESCRIPTION

=head1 1) Why Headers

Dynamic Content is dynamic, after all, so why would anybody care about
HTTP headers? Header composition is a task often neglected in the CGI
world. Because pages are generated dynamically, you might expect that
pages without a C<Last-Modified> header are fine, and that an
C<If-Modified-Since> header in the browser's request can be ignored.
This laissez-faire principle gets in the way when you try to establish
a server that is entirely driven by dynamic components and the number
of hits is significant.

If the number of hits is not significant, don't bother to read this
document.

If the number of hits is significant, you might want to consider what
cache-friendliness means (you may also want to read
L<[4]|correct_headers/_4_>) and how you can cooperate with caches to
increase the performance of your site. Especially if you use Squid in
accelerator mode (helpful hints for Squid, see
L<[1]|correct_headers/_1_>), you will have a strong motivation to
cooperate with it. This document may help you to do it correctly.

=head1 2) Which Headers

The HTTP standard (v 1.1 is specified in L<[3]|correct_headers/_3_>, v
1.0 in L<[2]|correct_headers/_2_>) describes lots of headers. In this
document, we only discuss those headers which are most relevant to
caching.

I have grouped the headers into three groups: date headers, content
headers, and the special Vary header.

=head2 2.1) Date Related Headers

=head2 2.1.1) Date

Section 14.18 of the HTTP standard deals with the circumstances under
which you must or must not send a C<Date> header. For almost
everything a normal mod_perl user is doing, a C<Date> header needs to
be generated. But the mod_perl programmer doesn't have to worry about
this header, since the Apache server guarantees that it is sent.

In C<http_protocol.c> the C<Date> header is set according to
C<$r-E<gt>request_time>. A mod_perl script can read, but not change,
C<$r-E<gt>request_time>.

=head2 2.1.2) Last-Modified

Section 14.29 of the HTTP standard deals with this. The
C<Last-Modified> header is mostly used as a so-called weak
validator. Here are two sentences from the HTTP specs:

  A validator that does not always change when the resource
  changes is a "weak validator."

  One can think of a strong validator as one that changes
  whenever the bits of an entity changes, while a weak value
  changes whenever the meaning of an entity changes.

This tells us that we should consider the semantics of the page we are
generating and not the date when we are running. The question is, when
did the B<meaning> of this page change last time? Let's imagine the
document in question is a text-to-gif renderer that takes as input a
font to use, background and foreground colours, and a string to
render. Although the actual image is created on-the-fly, the semantics
of the page are determined when the script was last changed, right?
Actually, a few more things are relevant: the semantics also change a
little when you update one of the fonts that may be used, or when you
update your C<ImageMagick> or equivalent program. It's something you
should consider, if you want to get it right.

If you have a page which comprises several components, you should ask
all the components when they last changed their semantic
behaviour. Then pick the most recent of those times.

mod_perl offers you two convenient methods to deal with this header:
update_mtime() and set_last_modified(). These methods and several
others are unavailable in the normal mod_perl environment but are
silently imported when you use C<Apache::File>. Refer to the
C<Apache::File> manpage for more info.

update_mtime() takes a UNIX time as its argument and sets Apache's
request structure finfo.st_mtime to this value. It does so only when
the argument is greater than the previously stored C<finfo.st_mtime>.

set_last_modified() sets the outgoing header C<Last-Modified> to the
string that corresponds to the stored finfo.st_mtime. When you pass a
UNIX time to set_last_modified(), mod_perl calls update_mtime() with
this argument first.

  use Apache::File;
  use Date::Parse;
  # Date::Parse parses RCS format, Apache::Util::parsedate doesn't
  $Mtime ||=
    Date::Parse::str2time(substr q$Date: 2002/01/06 16:54:56 $, 6);
  $r->set_last_modified($Mtime);

=head2 2.1.3) Expires and Cache-Control

Section 14.21 of the HTTP standard deals with the C<Expires>
header. The purpose of the C<Expires> header is to determine a point
in time after which the document should be considered out of date
(stale). Don't confuse this with the very different meaning of the
C<Last-Modified> header. The C<Expires> header is useful to avoid
unnecessary validation from now on until the document expires, and it
helps the recipients to clean up their stored documents. A sentence
from the HTTP standard:

  The presence of an Expires field does not imply that the
  original resource will change or cease to exist at, before, or
  after that time.

So think before you set up a time when you believe a resource should
be regarded as stale. Most of the time I can determine an expected
lifetime from "now", that is, the time of the request. I would not
recommend hardcoding the date of expiry, because if you forget that
you did it, and the date arrives, you will serve "already expired"
documents that cannot be cached at all by anybody. If you believe a
resource will never expire, read this quote from the HTTP specs:

  To mark a response as "never expires," an origin server sends an
  Expires date approximately one year from the time the response is
  sent. HTTP/1.1 servers SHOULD NOT send Expires dates more than one
  year in the future.

Now the code for the mod_perl programmer who wants to expire a
document half a year from now:

  $r->header_out('Expires',
                 HTTP::Date::time2str(time + 180*24*60*60));

A very handy alternative to this computation is available in HTTP 1.1,
the cache control mechanism. Instead of setting the C<Expires> header
you can specify a delta value in a C<Cache-Control> header. You can do
that by executing just:

  $r->header_out('Cache-Control', "max-age=" . 180*24*60*60);

which is, of course, much cheaper than the first example because perl
computes the value only once, at compile time, and optimizes it into a
constant.

As this alternative is only available in HTTP 1.1 and old cache
servers may not understand this header, it is advisable to send both
headers.
In this case the C<Cache-Control> header takes precedence, so the
C<Expires> header is ignored on HTTP 1.1 compliant servers. Or you
could go with an if/else clause:

  if ($r->protocol =~ /(\d\.\d)/ && $1 >= 1.1) {
      $r->header_out('Cache-Control', "max-age=" . 180*24*60*60);
  }
  else {
      $r->header_out('Expires',
                     HTTP::Date::time2str(time + 180*24*60*60));
  }

If you restart your Apache server regularly, I'd save the C<Expires>
header in a global variable. Oh well, this is probably over-engineered
now.

To avoid caching altogether, call:

  $r->no_cache(1);

which sets the headers:

  Pragma: no-cache
  Cache-control: no-cache

This should work with the major browsers. Don't set C<Expires> with
C<$r-E<gt>header_out> if you use C<$r-E<gt>no_cache>, because
header_out() takes precedence. The problem that remains is that there
are broken browsers which ignore C<Expires> headers.

=head2 2.2) Content Related Headers

=head2 2.2.1) Content-Type

You are most probably familiar with C<Content-Type>. Sections 3.7,
7.2.1 and 14.17 of the HTTP specs cover the details. mod_perl has the
C<content_type()> method to deal with this header, for example:

  $r->content_type("image/png");

C<Content-Type> I<should> be included in all messages according to the
specs, and Apache will generate one if you don't. It will be whatever
is specified in the relevant C<DefaultType> configuration directive,
or C<text/plain> if none is active.

=head2 2.2.2) Content-Length

According to section 14.13 of the HTTP specifications, the
C<Content-Length> header is the number of octets in the body of a
message. If this can be determined prior to sending, it is very useful
to include it, for several reasons. The most important reason is that
keepalive requests only work with responses that contain a
C<Content-Length> header. In mod_perl you can say:

  $r->header_out('Content-Length', $length);

If you use C<Apache::File>, you get the additional
C<set_content_length()> method for the Apache class, which is a bit
more efficient than the above. You can then say:

  $r->set_content_length($length);

The C<Content-Length> header can have an important impact on caches by
invalidating cache entries, as the following extract from the
specification explains:

  The response to a HEAD request MAY be cacheable in the sense that the
  information contained in the response MAY be used to update a
  previously cached entity from that resource. If the new field values
  indicate that the cached entity differs from the current entity (as
  would be indicated by a change in Content-Length, Content-MD5, ETag
  or Last-Modified), then the cache MUST treat the cache entry as
  stale.

So be careful never to send a wrong C<Content-Length>, either in a GET
or in a HEAD request.

=head2 2.2.3) Entity Tags

An C<Entity Tag> is a validator which can be used instead of, or in
addition to, the C<Last-Modified> header. An entity tag is a quoted
string which can be used to identify different versions of a
particular resource. An entity tag can be added to the response
headers like so:

  $r->header_out("ETag","\"$VERSION\"");

Note: mod_perl offers the C<Apache::set_etag()> method if you have
loaded C<Apache::File>. It is strongly recommended that you I<do not>
use this method unless you know what you are doing. C<set_etag()>
expects to be used in conjunction with a static request for a file on
disk that has been stat()ed in the course of the current request. It
is inappropriate and "dangerous" to use it for dynamic content.
By sending an entity tag you promise the recipient that you will not
send the same C<ETag> for the same resource again unless the content
is I<'equal'> to what you are sending now (see below for what equality
means).

The pros and cons of using entity tags are discussed in section 13.3
of the HTTP specs. For us mod_perl programmers that discussion can be
summed up as follows:

There are strong and weak validators. Strong validators change
whenever a single bit changes in the response. Weak validators change
when the meaning of the response changes. Strong validators are needed
for caches to allow for sub-range requests. Weak validators allow a
more efficient caching of equivalent objects. Algorithms like MD5 or
SHA are good strong validators, but what we usually want, when we want
to take advantage of caching, is a good weak validator.

A C<Last-Modified> time, when used as a validator in a request, can be
strong or weak, depending on a couple of rules. Please refer to
section 13.3.3 of the HTTP standard to understand these rules. This is
mostly relevant for range requests, as this citation of section 14.27
explains:

  If the client has no entity tag for an entity, but does have a
  Last-Modified date, it MAY use that date in a If-Range header.

But it is not limited to range requests. As section 13.3.1 succinctly
states:

  The Last-Modified entity-header field value is often used as a
  cache validator.

The fact that a C<Last-Modified> date may be used as a strong
validator can be pretty disturbing if we are in fact changing our
output slightly without changing the semantics of the output. To
prevent this kind of misunderstanding between us and the cache servers
in the response chain, we can send a weak validator in an C<ETag>
header. This is possible because the specs say:

  If a client wishes to perform a sub-range retrieval on a value for
  which it has only a Last-Modified time and no opaque validator, it
  MAY do this only if the Last-Modified time is strong in the sense
  described here.

In other words: by sending them an C<ETag> that is marked as weak we
prevent them from using the Last-Modified header as a strong
validator. An C<ETag> value is marked as a weak validator by
prepending the string C<W/> to the quoted string; otherwise it is
strong. In perl this would mean something like this:

  $r->header_out('ETag',"W/\"$VERSION\"");

Consider carefully which string you choose to act as a validator. You
are on your own with this decision because...

  ... only the service author knows the semantics of a resource
  well enough to select an appropriate cache validation mechanism,
  and the specification of any validator comparison function more
  complex than byte-equality would open up a can of worms. Thus,
  comparisons of any other headers (except Last-Modified, for
  compatibility with HTTP/1.0) are never used for purposes of
  validating a cache entry.

If you are composing a message from multiple components, it may be
necessary to combine some kind of version information for all these
components into a single string.

If you are producing relatively large documents, or content that does
not change frequently, you most likely will prefer a strong entity
tag, thus giving caches a chance to transfer the document in chunks.
(Anybody in the mood to add a chapter about ranges to this document?)

=head2 2.3) Content Negotiation

Content negotiation is a particularly wonderful feature that was
introduced with HTTP 1.1. Unfortunately it is not yet widely
supported.
Probably the most popular usage scenario for content negotiation is
language negotiation. A user specifies in the browser preferences the
languages they understand and how well they understand them. The
browser includes these settings in an C<Accept-Language> header when
it sends the request to the server, and the server then chooses from
several available representations of the document the one that best
fits the user's preferences. Content negotiation is not limited to
language. Citing the specs:

  HTTP/1.1 includes the following request-header fields for enabling
  server-driven negotiation through description of user agent
  capabilities and user preferences: Accept (section 14.1), Accept-
  Charset (section 14.2), Accept-Encoding (section 14.3), Accept-
  Language (section 14.4), and User-Agent (section 14.43). However, an
  origin server is not limited to these dimensions and MAY vary the
  response based on any aspect of the request, including information
  outside the request-header fields or within extension header fields
  not defined by this specification.

=head2 2.3.1) Vary

In order to signal to the recipient that content negotiation has been
used to determine the best available representation for a given
request, the server must include a C<Vary> header. This tells the
recipient which request headers have been used to determine it. So an
answer may be generated like this:

  $r->header_out('Vary', join ", ",
                 qw(accept accept-language accept-encoding user-agent));

The header of a very cool page may greet the user with something like

  Hallo Kraut, Dein NutScrape versteht zwar PNG aber leider
  kein GZIP.

("Hello Kraut, your NutScrape does understand PNG, but unfortunately
no GZIP.") But this has the side effect of being expensive for a
caching proxy. As of this writing, Squid (version 2.1PATCH2) does not
cache resources that come with a Vary header at all. So unless you
find a clever workaround, you won't enjoy your Squid accelerator for
these documents :-(

=head1 3) Requests

Section 13.11 of the specifications states that the only two cacheable
methods are C<GET> and C<HEAD>.

=head2 3.1) HEAD

Among the above recommended headers, the date-related ones (C<Date>,
C<Last-Modified>, and C<Expires>/C<Cache-Control>) are usually easy to
produce, and thus should be computed for C<HEAD> requests just the
same as for C<GET> requests.

The C<Content-Type> and C<Content-Length> headers should be exactly
the same as would be supplied to the corresponding C<GET> request. But
since they can be expensive to compute, they can just as well be
omitted, as there is nothing in the specs that forces you to compute
them.

What is important for the mod_perl programmer is that the response to
a C<HEAD> request I<must not> contain a message-body. The code in your
mod_perl handler might look like this:

  # compute the headers that are easy to compute
  if ( $r->header_only ){ # currently equivalent to $r->method eq "HEAD"
      $r->send_http_header;
      return OK;
  }

If you are running a Squid accelerator, it will be able to handle the
whole C<HEAD> request for you, but under some circumstances it may not
be allowed to do so.

=head2 3.2) POST

The response to a C<POST> request is not cacheable, due to an
underspecification in the HTTP standards. Section 13.4 does not forbid
caching of responses to C<POST> requests, but no other part of the
HTTP standard explains how caching of C<POST> requests could be
implemented, so we are in a vacuum here, and all existing caching
servers therefore refuse to implement caching of C<POST> requests.
This may change if somebody does the groundwork of defining the
semantics for cache operations on C<POST>. Note that some browsers,
with their more aggressive caching, do implement caching of C<POST>
requests.

Note: If you are running a Squid accelerator, you should be aware that
it accelerates outgoing traffic, but does not bundle incoming
traffic. If you have long C<POST> requests, Squid doesn't buy you
anything. So always consider using a C<GET> instead of a C<POST> if
possible.

=head2 3.3) GET

A normal C<GET> is what we usually write our mod_perl programs
for. Nothing special about it. We send our headers followed by the
body. But there is a certain case that needs a workaround to achieve
better cacheability. We need to deal with the "?" in the rel_path part
of the requested URI. Section 13.9 specifies that

  ... caches MUST NOT treat responses to such URIs as fresh unless
  the server provides an explicit expiration time. This specifically
  means that responses from HTTP/1.0 servers for such URIs SHOULD NOT
  be taken from a cache.

You are tempted to believe that if we are using HTTP 1.1 and send an
explicit expiration time we are on the safe side? Unfortunately
reality is a little bit different. It has been a bad habit for quite a
long time to misconfigure cache servers such that they treat all
C<GET> requests containing a question mark as uncacheable. People even
used to mark everything as uncacheable that contained the string
C<cgi-bin>.

To work around such misconfigurations, I have stopped calling my CGI
directories C<cgi-bin> and I have written the following handler that
lets me work with CGI-like query strings without rewriting the
software (such as C<Apache::Request> and C<CGI.pm>) that deals with
them:

  sub handler {
      my($r) = @_;
      my $uri = $r->uri;
      if ( my($u1,$u2) = $uri =~ / ^ ([^?]+?) ; ([^?]*) $ /x ) {
          $r->uri($u1);
          $r->args($u2);
      }
      elsif ( my($u1,$u2) = $uri =~ m/^(.*?)%3[Bb](.*)$/ ) {
          # protect against old proxies that escape volens nolens
          # (see HTTP standard section 5.1.2)
          $r->uri($u1);
          $u2 =~ s/%3B/;/gi;
          $u2 =~ s/%26/&/gi;
          $u2 =~ s/%3D/=/gi;
          $r->args($u2);
      }
      DECLINED;
  }

This handler must be installed as a C<PerlPostReadRequestHandler>. The
handler takes any request that contains one or more semicolons but
I<no> question mark, such that the first semicolon is interpreted as a
question mark and everything after it as the query string. You can now
exchange the request:

  http://example.com/query?BGCOLOR=blue;FGCOLOR=red

with:

  http://example.com/query;BGCOLOR=blue;FGCOLOR=red

Thus it allows the co-existence of queries from ordinary forms that
are being processed by a browser and predefined requests for the same
resource. It has one minor bug: Apache doesn't allow percent-escaped
slashes in such a query string. So instead of:

  http://example.com/query;BGCOLOR=blue;FGCOLOR=red;FONT=%2Ffont%2Fbla

you have to use:

  http://example.com/query;BGCOLOR=blue;FGCOLOR=red;FONT=/font/bla

=head2 3.4) Conditional GET

A rather challenging request mod_perl programmers can get is the
conditional C<GET>, which typically means a request with an
If-Modified-Since header. The HTTP specifications have this to say:

  The semantics of the GET method change to a "conditional GET" if
  the request message includes an If-Modified-Since,
  If-Unmodified-Since, If-Match, If-None-Match, or If-Range header
  field. A conditional GET method requests that the entity be
  transferred only under the circumstances described by the
  conditional header field(s).
  The conditional GET method is intended to reduce unnecessary
  network usage by allowing cached entities to be refreshed without
  requiring multiple requests or transferring data already held by
  the client.

So how can we reduce unnecessary network usage in such a case?
mod_perl makes it easy for you by offering Apache's
C<meets_conditions()>. You have to set up your C<Last-Modified> (and
possibly C<ETag>) header before calling this method. If the return
value of this method is anything other than C<OK>, you should return
that value from your handler and you're done. Apache handles the rest
for you. The following example is taken from
L<[5]|correct_headers/_5_>:

  if ((my $rc = $r->meets_conditions) != OK) {
      return $rc;
  }
  #else ... go and send the response body ...

If you have a Squid accelerator running, it will often handle the
conditionals for you, and you can enjoy its extremely fast responses
for such requests by reading the I<access.log>: just grep for
C<TCP_IMS_HIT/304>. But as with a C<HEAD> request, there are
circumstances under which it may not be allowed to do so. That is why
the origin server (which is the server you're programming) needs to
handle conditional C<GET>s as well, even if a Squid accelerator is
running.

=head1 4) Avoiding Dealing with Headers

There is another approach to dynamic content that is possible with
mod_perl. This approach is appropriate if the content changes
relatively infrequently, if you expect lots of requests to retrieve
the same content before it changes again, and if it is much cheaper to
test whether the content needs refreshing than it is to refresh it.

In this case a C<PerlFixupHandler> can be installed for the relevant
location. It tests whether the content is up to date. If so, it
returns C<DECLINED> and lets the Apache core serve the content from a
file. Otherwise, it regenerates the content into the file, updates the
C<$r-E<gt>finfo> status and again returns C<DECLINED> so that Apache
serves the updated file. Updating C<$r-E<gt>finfo> can be achieved by
calling

  $r->filename($file); # force update of finfo

even if this seems redundant because the filename is already equal to
C<$file>. Setting the filename has the side effect of doing a
C<stat()> on the file. This is important because otherwise Apache
would use the out of date C<finfo> when generating the response
header. A sketch of such a fixup handler follows.
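Here is a minimal, untested sketch of this approach. The module name
and the two helpers are made up for illustration: content_is_fresh()
stands in for your own cheap up-to-date test and regenerate_content()
for your own (possibly expensive) generator:

  package Apache::RegenerateOnDemand;  # hypothetical module name
  use strict;
  use Apache::Constants qw(DECLINED);

  # toy freshness test: consider the file fresh for 10 minutes
  sub content_is_fresh {
      my $file = shift;
      return -e $file && (time() - (stat _)[9]) < 600;
  }

  # toy generator: replace this with the real, expensive one
  sub regenerate_content {
      my $file = shift;
      open FILE, ">$file" or die "Can't write $file: $!";
      print FILE "Generated at ", scalar localtime, "\n";
      close FILE;
  }

  sub handler {
      my $r = shift;
      my $file = $r->filename;
      regenerate_content($file) unless content_is_fresh($file);
      $r->filename($file);  # force a stat() so finfo is up to date
      return DECLINED;      # let the Apache core serve the file
  }
  1;

and in I<httpd.conf> something like:

  <Location /cached>
      PerlFixupHandler Apache::RegenerateOnDemand
  </Location>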
=head1 References

=head2 [1]

Stas Bekman: mod_perl Guide. http://perl.apache.org/guide/

=head2 [2]

T. Berners-Lee et al.: Hypertext Transfer Protocol -- HTTP/1.0, RFC 1945.

=head2 [3]

R. Fielding et al.: Hypertext Transfer Protocol -- HTTP/1.1, RFC 2616.

=head2 [4]

Martin Hamilton: Cachebusting - cause and prevention, draft-hamilton-cachebusting-01. Also available online at http://vancouver-webpages.com/CacheNow/

=head2 [5]

Lincoln Stein, Doug MacEachern: Writing Apache Modules with Perl and C, O'Reilly, 1-56592-567-X. Selected chapters available online at http://www.modperl.com . Amazon page at http://www.amazon.com/exec/obidos/ASIN/156592567X/writinapachemodu/

=head1 Other resources

=over

=item * Prevent the browser from Caching a page

http://www.pacificnet.net/~johnr/meta.html

This page is an explanation of using the Meta tag to prevent caching, by browser or proxy, of an individual page wherein the page in question has data that may be of a sensitive nature, as in a "form page for submittal", and the creator of the page wants to make sure that the page does not get submitted twice. Please notice that some of the information on this page is a little bit outdated, but it's still a good resource for those who cannot generate their own HTTP headers.

=item * Web Caching and Content Delivery Resources

http://www.web-caching.com/

=back

=cut

1.1                  modperl-docs/src/docs/1.0/guide/databases.pod

Index: databases.pod
===================================================================

=head1 NAME

mod_perl and Relational Databases

=head1 Why Relational (SQL) Databases

Nowadays millions of people surf the Internet. There are millions of Terabytes of data lying around. To manipulate this data, new smart techniques and technologies were invented. One of the major inventions was the relational database, which allows us to search and modify huge stores of data very quickly. We use B<SQL> (Structured Query Language) to access and manipulate the contents of these databases.

=head1 Apache::DBI - Initiate a persistent database connection

When people started to use the web, they found that they needed to write web interfaces to their databases. CGI is the most widely used technology for building such interfaces. The main limitation of a CGI script driving a database is that its database connection is not persistent - on every request the CGI script has to re-connect to the database, and when the request is completed the connection is closed.

C<Apache::DBI> was written to remove this limitation. When you use it, you have a database connection which persists for the process' entire life. So when your mod_perl script needs to use a database, C<Apache::DBI> provides a valid connection immediately and your script starts work right away without having to initiate a database connection first. This is possible only with CGI running under a mod_perl enabled server, since in this model the child process does not quit when the request has been served.

It's almost as straightforward as it sounds; there are just a few things to know about and we will cover them in this section.

=head2 Introduction

The DBI module can make use of the C<Apache::DBI> module. When it loads, the DBI module tests if the environment variable C<$ENV{MOD_PERL}> is set, and if the C<Apache::DBI> module has already been loaded. If so, the DBI module will forward every connect() request to the C<Apache::DBI> module.

C<Apache::DBI> uses the ping() method to look for a database handle from a previous connect() request, and tests if this handle is still valid. If these two conditions are fulfilled it just returns the database handle. If there is no appropriate database handle or if the ping() method fails, C<Apache::DBI> establishes a new connection and stores the handle for later re-use. When the script is run again by a child that is still connected, C<Apache::DBI> just checks the cache of open connections by matching the I<host>, I<username> and I<password> parameters against it. A matching connection is returned if available, or a new one is initiated and then returned.

There is no need to delete the disconnect() statements from your code. They won't do anything because the C<Apache::DBI> module overloads the disconnect() method with an empty one.
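
To see what this transparency means in practice, here is a minimal C<Apache::Registry> script sketch. The DSN and credentials below reuse the illustrative values from the C<connect_on_init()> example in the next section, and the C<users> table is made up; nothing in the script mentions C<Apache::DBI> at all:

  use strict;
  use DBI ();

  my $r = shift;
  $r->send_http_header('text/plain');

  # a plain DBI connect() call: with Apache::DBI loaded it is
  # intercepted, and a cached per-child connection is returned
  # if one was already opened with the same parameters
  my $dbh = DBI->connect("DBI:mysql:myDB:myserver", "username", "passwd",
                         { RaiseError => 1 })
      or die "Can't connect: $DBI::errstr";

  my ($count) = $dbh->selectrow_array("SELECT COUNT(*) FROM users");
  print "$count users\n";

  # harmless: Apache::DBI turns disconnect() into a no-op
  $dbh->disconnect;

Because the connection parameters serve as the cache key, make sure all your scripts connect with identical arguments (see the C<LongReadLen> discussion below).
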
=head2 When should this module be used and when shouldn't it be used?

You will want to use this module if you are opening several database connections to the server. C<Apache::DBI> will make them persistent per child, so if you have ten children and each opens two different connections (with different connect() arguments) you will have in total twenty open and persistent connections. After the initial connect() you will save the connection time for every connect() request from your C<DBI> module. This can be a huge benefit for a server with a high volume of database traffic.

You must B<not> use this module if you are opening a special connection for each of your users. Each connection will stay persistent and in a short time the number of connections will be so big that your machine will scream in agony and die.

If you want to use C<Apache::DBI> but you have both situations on one machine, at the time of writing the only solution is to run two Apache/mod_perl servers, one which uses C<Apache::DBI> and one which does not.

=head2 Configuration

After installing this module, the configuration is simple - add the following directive to C<httpd.conf>:

  PerlModule Apache::DBI

Note that it is important to load this module before any other C<Apache*DBI> module and before the C<DBI> module itself! You can skip preloading C<DBI>, since C<Apache::DBI> does that. But there is no harm in leaving it in, as long as it is loaded after C<Apache::DBI>.

=head2 Preopening DBI connections

If you want to make sure that a connection will already be opened when your script is first executed after a server restart, then you should use the C<connect_on_init()> method in the startup file to preload every connection you are going to use. For example:

  Apache::DBI->connect_on_init
    ("DBI:mysql:myDB:myserver",
     "username",
     "passwd",
     {
      PrintError => 1, # warn() on errors
      RaiseError => 0, # don't die on error
      AutoCommit => 1, # commit executes immediately
     }
    );

As noted above, use this method only if you want all of Apache to be able to connect to the database server as one user (or as a very few users), i.e. if your user(s) can effectively share the connection. Do B<not> use this method if you want, for example, one unique connection per user.

Be warned though, that if you call C<connect_on_init()> and your database is down, Apache children will be delayed at server startup, trying to connect. They won't begin serving requests until either they are connected, or the connection attempt fails. Depending on your DBD driver, this can take several minutes!

=head2 Debugging Apache::DBI

If you are not sure if this module is working as advertised, you should enable Debug mode in the startup script by:

  $Apache::DBI::DEBUG = 1;

Starting with C<ApacheDBI-0.84>, setting C<$Apache::DBI::DEBUG = 1> will produce only minimal output. For a full trace you should set C<$Apache::DBI::DEBUG = 2>.

After setting the DEBUG level you will see entries in the C<error_log> both when C<Apache::DBI> initializes a connection and when it returns one from its cache. Use the following command to view the log in real time (your C<error_log> might be located at a different path, it is set in the Apache configuration files):

  tail -f /usr/local/apache/logs/error_log

I use C<alias> (in C<tcsh>) so I do not have to remember the path:

  alias err "tail -f /usr/local/apache/logs/error_log"

=head2 Database Locking Risks

Be very careful when locking the database (C<LOCK TABLE ...>) or individual rows if you use C<Apache::DBI> or similar persistent connections. MySQL threads keep tables locked until the thread ends (connection is closed) or the tables are unlocked.
If your session die()'s while tables are locked, they will stay neatly locked, as your connection won't be closed either. See the section L<Handling the 'User pressed Stop button' case|debug/Handling_the_User_pressed_Stop_> for more information on prevention.

=head2 Troubleshooting

=head3 The Morning Bug

The SQL server keeps a connection to the client open for a limited period of time. In the early days of C<Apache::DBI>, developers were bitten by the so-called I<Morning bug>: every morning the first users to use the site received a C<No Data Returned> message, but after that everything worked fine.

The error was caused by C<Apache::DBI> returning a handle of an invalid connection (the server had closed it because of a timeout), and the script was dying on that error. The C<ping()> method was introduced to solve this problem, but it didn't work properly till C<Apache::DBI> version 0.82 was released. In that version and afterwards ping() was called inside an C<eval> block, which resolved the problem.

It's possible that some C<DBD::> drivers don't have the ping() method implemented. The C<Apache::DBI> manpage explains how to write one.

Another solution was found - to increase the timeout parameter when starting the database server. Currently we start the C<MySQL> server with the script C<safe_mysql>, so we have modified it to use this option:

  nohup $ledir/mysqld [snipped other options] -O wait_timeout=172800

172800 seconds is equal to 48 hours. This change solves the problem, but the ping() method works properly in C<DBD::mysql> as well.

=head3 Opening connections with different parameters

When it receives a connection request, before it decides to use an existing cached connection, C<Apache::DBI> insists that the new connection be opened in exactly the same way as the cached connection. If I have one script that sets C<LongReadLen> and one that does not, C<Apache::DBI> will make two different connections. So instead of having a maximum of 40 open connections, I can end up with 80.

However, you are free to modify the handle immediately after you get it from the cache. So always initiate connections using the same parameters and set C<LongReadLen> (or whatever) afterwards.

=head3 Cannot find the DBI handle

You must use C<DBI::connect()> as in normal DBI usage to get your $dbh database handle. Using C<Apache::DBI> does not eliminate the need to write proper C<DBI> code. As the C<Apache::DBI> man page states, you should program as if you are not using C<Apache::DBI> at all. C<Apache::DBI> will override the DBI methods where necessary and return your cached connection. Any C<disconnect()> call will just be ignored.

=head3 Apache::DBI does not work

Make sure you have it installed.

Make sure you configured mod_perl with either:

  PERL_CHILD_INIT=1 PERL_STACKED_HANDLERS=1

or

  EVERYTHING=1

Use the example script eg/startup.pl (in the mod_perl distribution). Remove the comment from the line

  # use Apache::DebugDBI;

and adapt the connect string. Do not change anything in your scripts for use with C<Apache::DBI>.

=head3 Skipping connection cache during server startup

Does your error_log look like this?

  10169 Apache::DBI PerlChildInitHandler
  10169 Apache::DBI skipping connection cache during server startup
  Database handle destroyed without explicit disconnect at
  /usr/lib/perl5/site_perl/5.005/Apache/DBI.pm line 29.

If so, you are trying to open a database connection in the parent httpd process.
If you do, children will each get a copy of this handle, causing clashes when the handle is used by two processes at the same time. Each child must have its own unique connection handle.

To avoid this problem, C<Apache::DBI> checks whether it is called during server startup. If so, the module skips the connection cache and returns immediately without a database handle. You must use the C<Apache::DBI-E<gt>connect_on_init()> method in the startup file.

=head3 Debugging code which deploys DBI

To log a trace of C<DBI> statement execution, you must set the C<DBI_TRACE> environment variable. The C<PerlSetEnv DBI_TRACE> directive must appear before you load C<Apache::DBI> and C<DBI>.

For example if you use C<Apache::DBI>, modify your C<httpd.conf> with:

  PerlSetEnv DBI_TRACE "3=/tmp/dbitrace.log"
  PerlModule Apache::DBI

Replace C<3> with the TRACE level you want. The traces from each request will be appended to C</tmp/dbitrace.log>. Note that the logs might interleave if requests are processed concurrently.

Within your code you can control trace generation with the trace() method:

  DBI->trace($trace_level)
  DBI->trace($trace_level, $trace_filename)

DBI trace information can be enabled for all handles using this DBI class method. To enable trace information for a specific handle use the similar C<$h-E<gt>trace> method.

Using the handle trace option with a C<$dbh> or C<$sth> is handy for limiting the trace info to the specific bit of code that you are interested in.

Trace Levels:

=over

=item * 0 - trace disabled.

=item * 1 - trace DBI method calls returning with results.

=item * 2 - trace method entry with parameters and exit with results.

=item * 3 - as above, adding some high-level information from the driver and also adding some internal information from the DBI.

=item * 4 - as above, adding more detailed information from the driver and also including DBI mutex information when using threaded perl.

=item * 5 and above - as above but with more and more obscure information.

=back

=head1 mysql_use_result vs. mysql_store_result.

Since many mod_perl developers use mysql as their preferred SQL engine, these notes explain the difference between C<mysql_use_result()> and C<mysql_store_result()>. The two influence the speed and size of the processes.

The C<DBD::mysql> (version 2.0217) documentation includes the following snippet:

  mysql_use_result attribute: This forces the driver to use
  mysql_use_result rather than mysql_store_result. The former is
  faster and less memory consuming, but tends to block other
  processes. (That's why mysql_store_result is the default.)

Think about it in client/server terms. When you ask the server to spoon-feed you the data as you use it, the server process must buffer the data, tie up that thread, and possibly keep any database locks open for a long time. So if you read a row of data and ponder it for a while, the tables you have locked are still locked, and the server is busy talking to you every so often. That is C<mysql_use_result()>.

If you just suck down the whole dataset to the client, then the server is free to go about its business serving other requests. This results in parallelism since the server and client are doing work at the same time, rather than blocking on each other doing frequent I/O. That is C<mysql_store_result()>.

As the mysql manual suggests: you should not use C<mysql_use_result()> if you are doing a lot of processing for each row on the client side. This can tie up the server and prevent other threads from updating the tables.
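
If you decide that a particular query really should stream its rows from the server, C<DBD::mysql> lets you request C<mysql_use_result> on a per-statement basis. Here is a minimal sketch (the table name is made up):

  use strict;
  use DBI ();

  my $dbh = DBI->connect("DBI:mysql:myDB:myserver", "username", "passwd",
                         { RaiseError => 1 });

  # ask the driver to use mysql_use_result for this statement only,
  # so rows are streamed instead of buffered on the client
  my $sth = $dbh->prepare("SELECT id, name FROM big_table",
                          { mysql_use_result => 1 });
  $sth->execute;

  while (my($id, $name) = $sth->fetchrow_array) {
      # process each row quickly: the server keeps resources tied
      # up until the last row has been fetched
  }
  $sth->finish;
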
=head1 Optimize: Run Two SQL Engine Servers

Sometimes you end up running many databases on the same machine. If these have very different usage profiles (such as one db with sessions, very frequently updated but tiny amounts of data, and another with large sets of data that's hardly ever updated), you might be able to gain a lot by running two differently configured databases on one server. One would benefit from lots of caching, the other would probably reduce the efficiency of the cache but would gain from fast disk access. Different usage profiles require vastly different performance needs.

This is basically a similar idea to having L<two Apache servers|strategy/One_Plain_Apache_and_One_mod_per>, each optimized for its specific requirements.

=head1 Some useful code snippets to be used with relational Databases

In this section you will find scripts, modules and code snippets to help you get started using relational Databases with mod_perl scripts. Note that I work with C<mysql> ( http://www.mysql.com ), so the code you find here will work out of the box with mysql. If you use some other SQL engine, it might work for you or it might need some changes. YMMV.

=head2 Turning SQL query writing into a short and simple task

Having to write many queries in my CGI scripts persuaded me to write a stand-alone module that saves me a lot of time in coding and debugging my code. It also makes my scripts much smaller and easier to read. I will present the module here, with examples following:

Notice the C<DESTROY> block at the end of the module, which makes various cleanups and allows this module to be used under mod_perl and C<mod_cgi> as well. Note that you will not get the benefit of persistent database handles with mod_cgi.

=head2 The My::DB module

=code My-DB.pm The My::DB module

(Note that you will not find this on CPAN, at least not yet :)

=head2 My::DB Module's Usage Examples

To use C<My::DB> in your script, you first have to create a C<My::DB> object:

  use vars qw($db_obj);
  my $db_obj = new My::DB
      or croak "Can't initialize My::DB object: $!\n";

Now you can use any of C<My::DB>'s methods. Assume that we have a table called I<tracker> where we store the names of the users and what they are doing at each and every moment (think about an online community program).

I will start with a very simple query--I want to know where the users are and produce statistics. C<tracker> is the name of the table.

  # fetch the statistics of where users are
  my $r_ary = $db_obj->sql_get_matched_rows_ary_ref
    ("tracker",
     [qw(where_user_are)],
    );

  my %stats = ();
  my $total = 0;
  foreach my $r_row (@$r_ary){
      $stats{$r_row->[0]}++;
      $total++;
  }

Now let's count how many users we have (in table C<users>):

  my $count = $db_obj->sql_count_matched("users");

Check whether a user exists:

  my $username = 'stas';
  my $exists = $db_obj->sql_count_matched
    ("users",
     [username => ["=",$username]]
    );

Check whether a user is online, and get the time since she went online (C<since> is a column in the C<tracker> table, it tells us when a user went online):

  my @row = ();
  $db_obj->sql_get_matched_row
    (\@row,
     "tracker",
     ['UNIX_TIMESTAMP(since)'],
     [username => ["=",$username]]
    );

  if (@row) {
      my $idle = int( (time() - $row[0]) / 60);
      return "Current status: Is Online and idle for $idle minutes.";
  }

A complex query. I join two tables, and I want a reference to an array which will store a slice of the matched query (C<LIMIT $offset,$hits>) sorted by C<username>.
Each row in the array is to include the fields from the C<users> table, but only those listed in C<@verbose_cols>. Then we print it out.

  my $r_ary = $db_obj->sql_get_matched_rows_ary_ref
    (
     "tracker STRAIGHT_JOIN users",
     [map {"users.$_"} @verbose_cols],
     [],
     ["WHERE tracker.username=users.username",
      "ORDER BY users.username",
      "LIMIT $offset,$hits"],
    );

  foreach my $r_row (@$r_ary){
      print ...
  }

Another complex query. The user checks checkboxes to be queried by, selects from lists and types in match strings. We process the input and build the C<@where> array. Then we want to get the number of matches and the matched rows as well.

  my @search_keys = qw(choice1 choice2);
  my @where = ();

  # Process the checkboxes - we turn them into a regular expression
  foreach (@search_keys) {
      next unless defined $q->param($_) and $q->param($_);
      my $regexp = "[".join("",$q->param($_))."]";
      push @where, ($_ => ['REGEXP',$regexp]);
  }

  # Add the items selected by the user from our lists
  # selected => exact match
  push @where,(country => ['=',$q->param('country')])
      if $q->param('country');

  # Add the parameters typed by the user
  foreach (qw(city state)) {
      push @where,($_ => ['LIKE',$q->param($_)])
          if $q->param($_);
  }

  # Count all that matched the query
  my $total_matched_users = $db_obj->sql_count_matched
    (
     "users",
     \@where,
    );

  # Now process the orderby
  my $orderby = $q->param('orderby') || 'username';

  # Do the query and fetch the data
  my $r_ary = $db_obj->sql_get_matched_rows_ary_ref
    (
     "users",
     \@verbose_cols,  # the columns to be fetched
     \@where,
     ["ORDER BY $orderby",
      "LIMIT $offset,$hits"],
    );

C<sql_get_matched_rows_ary_ref> knows how to handle both C<OR>ed and C<AND>ed params. The following example shows how to use C<OR> on parameters. This snippet is an implementation of a watchdog: our users want to know when their colleagues go online. They register the usernames of the people they want to know about. We have to make two queries: one to get a list of usernames, the second to find out whether any of these users is online. In the second query we use the C<OR> keyword.

  # check who we are looking for
  $r_ary = $db_obj->sql_get_matched_rows_ary_ref
    ("watchdog",
     [qw(watched)],
     [username => ['=',$username],
     ],
    );

  # put them into an array
  my @watched = map {$_->[0]} @{$r_ary};

  my %matched = ();
  # Does the user have some registered usernames?
  if (@watched) {
      # Try to fetch all the users who match the usernames exactly.
      # Put it into an array and compare it with a hash!
      $r_ary = $db_obj->sql_get_matched_rows_ary_ref
        ("tracker",
         [qw(username)],
         [username => ['=',\@watched],
         ]
        );

      map {$matched{$_->[0]} = 1} @{$r_ary};
  }

  # Now %matched includes the usernames of the users who are being
  # watched by $username and currently are online.

=cut

1.1                  modperl-docs/src/docs/1.0/guide/dbm.pod

Index: dbm.pod
===================================================================

=head1 NAME

mod_perl and dbm files

=head1 Where and Why to use dbm files

Some of the earliest databases implemented on Unix were dbm files, and many are still in use today. As of this writing the Berkeley DB is the most powerful dbm implementation (http://www.sleepycat.com).

If you need a light database, with an easy API, using simple key-value pairs to store and manipulate a relatively small number of records, this is a solution that should be amongst the first you consider.

With dbm, it is rare to read the whole database into memory. Combine this feature with the use of smart storage techniques, and dbm files can be manipulated much faster than flat files.
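
To give you a taste of how little code a dbm database requires, here is a minimal C<DB_File> sketch (the file path and the key/value pair are made up):

  use strict;
  use DB_File;
  use Fcntl qw(O_RDWR O_CREAT);

  # tie a hash to a Berkeley DB file: ordinary hash operations
  # now read and write the database transparently
  tie my %db, 'DB_File', "/tmp/example.db", O_RDWR|O_CREAT, 0660, $DB_HASH
      or die "Can't tie /tmp/example.db: $!";

  $db{mod_perl} = 'rules';
  print "mod_perl $db{mod_perl}\n";

  untie %db;
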
Flat file databases can be very slow on insert, update and delete operations, when the number of records starts to grow into the thousands. Sort algorithms on flat files can be very time-consuming.

The maximum practical size of a dbm database depends on many factors--your data, your hardware and the desired response times of course included--but as a rough guide consider 5,000 to 10,000 records to be reasonable.

We will talk mostly about the Berkeley DB version 1.x, as it provides the best functionality while having good speed and almost no limitations. Other implementations might be faster in some cases, but they are limited either in the maximum length of a value or in the total number of records.

There are a number of Perl interfaces to the major dbm implementations, to list a few: C<DB_File>, C<NDBM_File>, C<ODBM_File>, C<GDBM_File>, and C<SDBM_File>. The original Perl module for Berkeley DB was DB_File, which was written to interface to Berkeley DB version 1.85. The newer Perl module for Berkeley DB is C<BerkeleyDB>, which was written to interface to version 2.0 and subsequent releases. Because Berkeley DB version 2.X has a compatibility API for version 1.85, you can (and should!) build C<DB_File> using version 2.X of Berkeley DB, although C<DB_File> will still only support the 1.85 functionality.

Several different indexing algorithms (known also as access methods) can be used with dbm implementations:

=over

=item *

The C<HASH> access method gives an C<O(1)> complexity of search and update, fast insert and delete, but a slow sort (which you have to implement yourself). (Used by almost all dbm implementations)

=item *

The C<BTREE> access method allows arbitrary key/value pairs to be stored in a sorted, balanced binary tree. This allows us to get a sorted sequence of data pairs in C<O(1)>, but at the expense of much slower insert, update and delete operations than is the case with C<HASH>. (Available mostly in Berkeley DB)

=item *

The C<RECNO> access method is more complicated, and enables both fixed-length and variable-length flat text files to be manipulated using the same key/value pair interface as in C<HASH> and C<BTREE>. In this case the key will consist of a record (line) number. (Available mostly in Berkeley DB)

=item *

The C<QUEUE> access method stores fixed-length records with logical record numbers as keys. It is designed for fast inserts at the tail and has a special cursor consume operation that deletes and returns a record from the head of the queue. The C<QUEUE> access method uses record level locking. (Available only in Berkeley DB version 3.x)

=back

Most often you will want to use the C<HASH> method, but there are many considerations and your choice may be dictated by your application.

In recent years dbm databases have been extended to allow you to store more complex values, including data structures. The C<MLDBM> module can store and restore the whole symbol table of your script, including arrays and hashes.

It is important to note that you cannot simply switch a dbm file from one storage algorithm to another. The only way to change the algorithm is to copy all the records one by one into a new dbm file, which was initialized according to a desired access method. You can use a script like this:

  #!/usr/bin/perl -w

  #
  # This script takes as its parameters a list of Berkeley DB
  # file(s) which are stored with the DB_BTREE algorithm.  It
  # will back them up using the .btree extension and create
  # instead dbm files with the same records but stored using
  # the DB_HASH algorithm
  #
  # Usage: btree2hash.pl filename(s)

  use strict;
  use DB_File;
  use Fcntl;

  # Do checks
  die "Usage: btree2hash.pl filename(s)\n" unless @ARGV;

  foreach my $filename (@ARGV) {

      die "Can't find $filename: $!\n"
          unless -e $filename and -r $filename;

      # First backup the file
      rename "$filename", "$filename.btree"
          or die "can't rename $filename $filename.btree: $!\n";

      # tie both dbs (db_hash is a fresh one!)
      tie my %btree , 'DB_File',"$filename.btree", O_RDWR|O_CREAT,
          0660, $DB_BTREE or die "Can't tie $filename.btree: $!";
      tie my %hash , 'DB_File',"$filename" , O_RDWR|O_CREAT,
          0660, $DB_HASH or die "Can't tie $filename: $!";

      # copy DB
      %hash = %btree;

      # untie
      untie %btree;
      untie %hash;
  }

Note that some dbm implementations come with other conversion utilities as well.

=head1 mod_perl and dbm

Where does mod_perl fit into the picture?

If you need to access a dbm file in your mod_perl code in read-only mode, the operation will be much faster if you keep the dbm file open (tied) all the time, and therefore ready to be used. This will work with dynamic (read/write) database accesses as well, but you need to use locking and data flushing to avoid data corruption.

Although mod_perl and dbm can give huge performance gains compared to the use of flat file databases, you should be very careful. In addition to the need for locking, you need to consider the consequences of C<die()> and unexpected process death. If your locking mechanism cannot handle dropped locks, a stale lock can deactivate your whole site. You can enter a deadlock situation if two processes simultaneously try to acquire locks on two separate databases. Each has locked only one of the databases, and cannot continue without locking the second. Yet this will never be freed because it is locked by the other process. If your processes all ask for their DB files in the same order, this situation cannot occur.

If you modify the DB, you should make very sure that you flush the data and synchronize it, especially when the process serving your handler unexpectedly dies. In general your application should be tested very thoroughly before you put it into production to handle important data.

=head1 Locking dbm Handlers and Write Lock Starvation Hazards

One has to deploy dbm file locking if there is a chance that some process will want to write to it. Note that once you need to do locking, you do it even when all you want is to read from the file: if you don't, it's possible that someone is writing to the file at this very moment and you may read partly updated data.

Therefore we should distinguish between I<READ> and I<WRITE> locks. Before doing an operation on the dbm file, we first issue either a I<READ> or a I<WRITE> lock request, according to our needs.

If we are making a I<READ> lock request, it is granted as soon as the I<WRITE> lock on the file, if any, is removed, or immediately if the file is already I<READ> locked. The lock status becomes I<READ> on success.

If we make a I<WRITE> lock request, it is granted as soon as the file becomes unlocked. The lock status becomes I<WRITE> on success.

The treatment of the I<WRITE> lock request is most important. If the DB is I<READ> locked, a process that makes a I<WRITE> request will poll until there are no reading or writing processes left. Lots of processes can successfully read the file, since they do not block each other.
This means that a process that wants to write to the file may never get a chance to squeeze in, since it needs to obtain an exclusive lock.

The following diagram represents a possible scenario where everybody can read but no one can write (the pX's represent different processes running for different periods of time, all acquiring the read lock on the dbm file):

  [-p1-]       [--p1--]
     [--p2--]      [--p2--]
  [---------p3---------] [--------p3----....
                 [------p4-----]

The result is a starving process, which will time out the request, and it will fail to update the DB. Ken Williams solved the above problem with his L<C<Tie::DB_Lock>|dbm/mod_perl_and_dbm_files_> module, which is discussed in one of the following sections.

There are several locking wrappers for C<DB_File> on CPAN right now. Each one implements locking differently and has different goals in mind. It is therefore worth knowing the differences, so that you can pick the right one for your application.

=head1 Flawed Locking Methods Which Must Not Be Used

I<Caution>: The suggested locking methods in the Camel book and the C<DB_File> man page (before version 1.72, fixed in 1.73) are flawed. If you use them in an environment where more than one process can modify the dbm file, it can get corrupted!!! The following is an explanation of why this happens.

You may not use a tied file's filehandle for locking, since you get the filehandle after the file has already been tied. It's too late to lock. The problem is that the database file is locked I<after> it is opened. When the database is opened, the first 4k (in the Berkeley dbm library) is read and then cached in memory. Therefore, a process can open the database file, cache the first 4k, and then block while another process writes to the file. If the second process modifies the first 4k of the file, when the original process gets the lock it now has an inconsistent view of the database. If it writes using this view it may easily corrupt the database on disk.

This problem can be difficult to trace because it does not cause corruption every time a process has to wait for a lock. One can do quite a bit of writing to a database file without actually changing the first 4k. But once you suspect this problem you can easily reproduce it by making your program modify the records in the first 4k of the DB.

You'd better resort to using the standard modules for locking instead of inventing your own.

If your dbm file is used only in read-only mode, generally there is no need for locking at all. If you access the dbm file in read/write mode, the safest method is to tie() the dbm file after acquiring an external lock and untie() before the lock is released. So to access the file in shared mode (FLOCK_SH) one should follow this pseudo-code:

  flock FLOCK_SH  <===== start critical section
  tie()
  read...
  untie()
  flock FLOCK_UN  <===== end critical section

Similarly for exclusive (EX) write access:

  flock FLOCK_EX  <===== start critical section
  tie()
  write...
  sync()
  untie()
  flock FLOCK_UN  <===== end critical section

However you might want to save a few tie()/untie() calls if the same request accesses the dbm file more than once. You should be careful though. Based on the caching effect explained above, a process can perform an atomic downgrade of an exclusive lock to a shared one without re-tie()ing the file:

  flock FLOCK_EX  <===== start critical section
  tie()
  write...
  sync()          <===== end critical section
  flock FLOCK_SH  <===== start critical section
  read...
  untie()
  flock FLOCK_UN  <===== end critical section

because it has the updated data in its cache. By atomic, we mean it's ensured that the lock status gets changed without any other process getting exclusive access in between.

If you can ensure that one process safely upgrades a shared lock to an exclusive lock, you can save on tie()/untie(). But this operation might lead to a deadlock if two processes try to upgrade a shared lock to an exclusive one at the same time. Remember that in order to acquire an exclusive lock, all other processes need to release *all* locks. If your OS locking implementation resolves this deadlock by denying one of the upgrade requests, make sure your program handles that appropriately. The process that was denied has to untie() the dbm file and then ask for an exclusive lock.

A dbm file always has to be untie()'d before the lock is released (unless you do an atomic downgrade from exclusive to shared as we have just explained). Remember that if at any given moment a process wants to lock and access the dbm file, it has to re-tie() this file if it was tied already. If this is not done, the integrity of the dbm file is not ensured.

To conclude, the safest method of reading from a dbm file is to lock the file before tie()-ing it, untie() it before releasing the lock, and in the case of write, to call sync() before untie()-ing it.

=head1 Locking Wrappers Overview

Here are some of the correctly working dbm locking wrappers (three of them are available from CPAN):

=over

=item *

C<Tie::DB_Lock> -- a C<DB_File> wrapper which creates copies of the dbm file for read access, so that you have a kind of multiversioning concurrent read system. However, updates are still serial. After each update the read-only copies of the dbm file are recreated. Use this wrapper in situations where reads may be very lengthy and therefore the write starvation problem may occur. On the other hand if you have big dbm files, it may create a big load on the system if the updates are quite frequent. L<More information|dbm/mod_perl_and_dbm_files_>.

=item *

C<Tie::DB_FileLock> -- a C<DB_File> wrapper that has the ability to lock and unlock the database while it is being used. Avoids the tie-before-flock problem by simply re-tie-ing the database when you get or drop a lock. Because of the flexibility in dropping and re-acquiring the lock in the middle of a session, this can be used in a system that will work with long updates and/or reads. Refer to the C<Tie::DB_FileLock> manpage for more information.

=item *

C<DB_File::Lock> -- an extremely lightweight C<DB_File> wrapper that simply flocks an external lockfile before tie-ing the database and drops the lock after untie. Allows one to use the same lockfile for multiple databases to avoid deadlock problems, if desired. Use this for databases where updates and reads are quick, and simple flock locking semantics are enough. Refer to the C<DB_File::Lock> manpage for more information.

=item *

L<C<DB_File::Lock2>|dbm/mod_perl_and_dbm_files_> -- does the same thing as C<DB_File::Lock>, but has a slightly different implementation. I wrote it before David Harris released his C<DB_File::Lock> and I didn't want to kill mine, so I'll keep it here for a while :).

=item *

On some Operating Systems (FreeBSD is one example) it is possible to lock on tie:

  tie my %t, 'DB_File', $TOK_FILE, O_RDWR | O_EXLOCK, 0664;

and only release the lock by un-tie()-ing the file. Check if the C<O_EXLOCK> flag is available on your operating system before you try to use this method!

=back
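
If you prefer not to depend on a wrapper module, here is what the safe lock-then-tie sequence described above looks like as real code -- a minimal sketch for the exclusive (write) case, with made-up file paths:

  use strict;
  use DB_File;
  use Fcntl qw(:flock O_RDWR O_CREAT);
  use Symbol ();

  my $lockfh = Symbol::gensym();
  open $lockfh, ">/tmp/mydb.lock" or die "Can't open lockfile: $!";
  flock $lockfh, LOCK_EX;                  # start critical section

  tie my %db, 'DB_File', "/tmp/mydb", O_RDWR|O_CREAT, 0660, $DB_HASH
      or die "Can't tie /tmp/mydb: $!";
  $db{key} = 'value';                      # write...
  (tied %db)->sync;                        # sync() before untie()
  untie %db;

  flock $lockfh, LOCK_UN;                  # end critical section
  close $lockfh;
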
=head1 Tie::DB_Lock

C<Tie::DB_Lock> ties hashes to databases using shared and exclusive locks. This module, by Ken Williams, solves the problems raised in the previous section.

The main difference from what I have described above is that C<Tie::DB_Lock> copies a dbm file on read. Reading processes do not have to keep the file locked while they read it, and writing processes can still access the file while others are reading. This works best when you have lots of long-duration reading, and a few short bursts of writing.

The drawback of this module is the heavy IO performed when every reader makes a fresh copy of the DB. With big dbm files this can be quite a disadvantage and can slow the server down considerably.

An alternative would be to have one copy of the dbm image shared by all the reading processes. This can cut the number of files that are copied, and puts the responsibility of copying the read-only file on the writer, not the reader. It would need some care to make sure it does not disturb readers when putting a new read-only copy into place.

=head1 DB_File::Lock2

=code DB_File-Lock2.pm

Here is C<DB_File::Lock2> which does the locking by using an external lockfile. This allows you to gain the lock before the file is tied. Note that it's not yet on CPAN and so is linked from here in its entirety. Note also that this code still needs some testing, so I<be careful> if you use it on a production machine.

You use it like this:

  use DB_File::Lock2 ();

A simple tie, READ lock and untie:

  use DB_File::Lock2 ();
  my $dbfile = "/tmp/test";
  tie my %mydb, 'DB_File::Lock2', $dbfile, 'read';
  print $mydb{foo} if exists $mydb{foo};
  untie %mydb;

You can even skip the C<untie()> call. When C<%mydb> goes out of scope everything will be done automatically. However it is better to use the explicit call, to make sure the critical sections between lock and unlock are as short as possible. This is especially important when requesting an exclusive (write) lock.

The following example shows how it might be convenient to skip the explicit C<untie()>. In this example, we don't need to save the intermediate result, we just return and the cleanup is done automatically.

  use DB_File::Lock2 ();
  my $dbfile = "/tmp/test";
  print user_exists("stas") ? "Yes" : "No";

  sub user_exists{
      my $username = shift || '';

      warn("No username passed\n"), return 0
          unless $username;

      tie my %mydb, 'DB_File::Lock2', $dbfile, 'read';

      # if we match the username return 1, else 0
      return $mydb{$username} ? 1 : 0;

  } # end of sub user_exists

Now let's write all the upper case characters and their respective ASCII values to a dbm file. Then read the file and print the contents of the DB, unsorted.

  use DB_File::Lock2 ();
  my $dbfile = "/tmp/test";

  # write
  tie my %mydb, 'DB_File::Lock2', $dbfile, 'write';
  for (0..25) {
      $mydb{chr(65+$_)} = $_;
  }
  untie %mydb;

  # now, read them and print out (unsorted)
  # notice that 'read' is the default lock mode
  tie %mydb, 'DB_File::Lock2', $dbfile;
  while (my($k,$v) = each %mydb) {
      print "$k => $v\n";
  }
  untie %mydb;

If your CGI script is interrupted, the C<DESTROY> block will take care of unlocking the dbm file and flushing any changes. So your DB will be safe against possible corruption because of unclean program termination.

=cut

1.1                  modperl-docs/src/docs/1.0/guide/debug.pod

Index: debug.pod
===================================================================

=head1 NAME

Debugging mod_perl

=head1 Warning and Errors Explained

Let's talk first about things that bother most web (and non-web) programmers.
I<The bothering things> are warnings and errors reported by Perl. We are going to learn how to get the best out of both, by turning these apparent enemies of the newbie programmer into our best friends.

=head2 Curing The "Internal Server Error"

You have just installed this new CGI script and when you try it out you see the grey screen of death saying "Internal Server Error"... Or even worse, you have a script running on a production server for a long time without problems, when the same grey screen starts to show up occasionally for no apparent reason.

How can we find out what the problem is?

First problem: You have been coding in Perl for years, and whenever an error occurred in the past it was displayed in the same terminal window that you started the script from. But when you work with a webserver there is no terminal to show you the errors, since the server in most cases has no terminal to send the error messages to.

Actually, the error messages don't disappear, they end up in the I<error_log> file. It is located in the directory specified by the C<ErrorLog> directive in I<httpd.conf>. The default setting is generally:

  ErrorLog /usr/local/apache/logs/error_log

So whenever you see I<"Internal Server Error"> it's time to look at this file.

First problem solved!

There are cases when errors don't go to the error_log file. For example, some errors go to the httpd process' STDERR. If you haven't redirected httpd's STDERR then the messages are printed to the console (tty, terminal) from which you executed the httpd. This happens when the server didn't get as far as opening the error_log file for writing before it needed to write an error message.

For example, if you have entered a non-existent directory path in your C<ErrorLog> directive, the error message will be printed to STDERR. If the error happens when the server executes a C<PerlRequire> or C<PerlModule> directive you might also see output sent to STDERR.

You are probably wondering where all the errors go when you are running the server in single process mode (C<httpd -X>). They go to STDERR. This is because the error logging for all the httpd children is normally done by the parent httpd. When httpd runs in single process mode, it has no parent httpd process to perform all the logging. The output to the terminal includes all the status messages that normally go to the error_log file.

Finally, with a C<PerlLogHandler> you can take away from Apache its control of the error logging process for all HTTP transactions. If you do this, then you are responsible for generating and storing the error messages. You can do whatever you like with the information (including throwing it away -- don't do it!) and, depending on how you implement your C<LogHandler>, the C<ErrorLog> directive may have no effect. But you can also do something in this handler and then return the C<DECLINED> status, so the default Apache LogHandler will do the work as usual.

Second problem: The usefulness of the error message depends to some extent on the programmer's coding style. An uninformative message might not help you to spot and fix the error.

For example, let's take a function which opens a file passed to it as a parameter. It does nothing else with the file. Here's our first version of the code:

  my $r = shift;
  $r->send_http_header('text/plain');

  sub open_file{
      my $filename = shift || '';

      die "No filename passed!"
          unless $filename;
      open FILE, $filename or die;
  }

  open_file("/tmp/test.txt");

Let's assume that C</tmp/test.txt> doesn't exist, so the open() will fail to open the file. When we call this script from our browser, the browser returns an I<"internal error"> message and we see the following error appended to I<error_log>:

  Died at /home/httpd/perl/test.pl line 9.

We can use the hint Perl kindly gave to us to find where in the code the die() was called. However, we still don't know what filename was passed to this subroutine to cause the program termination.

If we have only one function call as in the example above, the task of finding the problematic filename will be trivial. Now let's add two more open_file() function calls and assume that among the three files only I</tmp/test2.txt> exists:

  open_file("/tmp/test.txt");
  open_file("/tmp/test2.txt");
  open_file("/tmp/test3.txt");

When you execute the above call, you will see the same error message twice:

  Died at /home/httpd/perl/test.pl line 9.
  Died at /home/httpd/perl/test.pl line 9.

Based on this error message, can you tell what files your program failed to open? Probably not. Let's fix it by passing the name of the file to die():

  sub open_file{
      my $filename = shift || '';

      die "No filename passed!"
          unless $filename;
      open FILE, $filename or die "failed to open $filename";
  }

  open_file("/tmp/test.txt");

When we execute the above code, we see:

  failed to open /tmp/test.txt at /home/httpd/perl/test.pl line 9.

which makes a big difference.

By the way, if you append a newline to the end of the message you pass to die(), Perl won't report the line number the error has happened at, so if you code:

  open FILE, $filename or die "failed to open a file\n";

the error message will be:

  failed to open a file

which gives you very little to go on. It's very hard to debug with such uninformative error messages.

The warn() function, a kinder sister of die(), which logs the message but doesn't cause program termination, behaves in the same way. If you add a newline to the end of the message, the line number warn() was called at won't be logged, otherwise it will.

You might want to use warn() instead of die() if the failure isn't critical. Consider the following code:

  if(open FILE, $filename){
      # do something with file
  } else {
      warn "failed to open $filename";
  }
  # more code here...

Now we've improved our code by reporting the names of the problematic files, but we still don't know the reason for the failure. Let's try to improve the warn() example. The C<-r> operator tests whether the file is readable:

  if(-r $filename){
      open FILE, $filename;
      # do something with file
  } else {
      warn "Couldn't open $filename - doesn't exist or is not readable";
  }

Now if we cannot read the file we do not even try to open it. But we still see a warning in error_log:

  Couldn't open /tmp/test.txt - doesn't exist or is not readable
  at /home/httpd/perl/test.pl line 9.

The warning tells us the reason for the failure, so we don't have to go to the code and check what it was trying to do with the file.

It could be quite a coding overhead to explain all the possible failure reasons that way, but why reinvent the wheel? We already have the reason for the failure stored in the C<$!> variable. Let's go back to the open_file() function:

  sub open_file{
      my $filename = shift || '';

      die "No filename passed!"
          unless $filename;
      open FILE, $filename or die "failed to open $filename: $!";
  }

  open_file("/tmp/test.txt");

This time, if open() fails we see:

  failed to open /tmp/test.txt: No such file or directory
  at /home/httpd/perl/test.pl line 9.

Now we have all the information we need to debug these problems: we know what line of code triggered die(), we know what file we were trying to open, and last but not least we know the reason, given to us through Perl's C<$!> variable.

Now let's create the file I</tmp/test.txt>.

  % touch /tmp/test.txt

When we execute the latest version of the code, we see:

  failed to open /tmp/test.txt: Permission denied
  at /home/httpd/perl/test.pl line 9.

Here we see a different reason: we created a file that doesn't belong to the user which the server runs as (usually I<nobody>). It does not have permission to read the file.

Now you can see that it's much easier to debug your code if you validate the return values of the system calls, and properly code arguments to die() and warn() calls. The open() function is just one of the many system calls Perl provides for your convenience.

So now you can code and debug CGI scripts and modules as easily as if they were plain Perl scripts that you execute from a shell.

Second problem solved!

=head2 Helping error_log to Help Us

It's a good idea to keep the I<error_log> file open all the time in a dedicated terminal with the help of I<tail -f> or I<less -S>, whichever you prefer (the latter allows you to page around the file, search etc.):

  % tail -f /usr/local/apache/logs/error_log

or

  % less -S /usr/local/apache/logs/error_log

This way you will see all the errors and warnings as they happen.

Another tip is to create a shell I<alias>, to make it easier to execute the above command. In tcsh you would do something like this:

  % alias err "tail -f /usr/local/apache/logs/error_log"

For bash users the command is:

  % alias err='tail -f /var/log/apache/error.log'

and from now on in the shell you set the alias in, executing

  % err

will call I<tail -f /usr/local/apache/logs/error_log>. Since you want this alias to be available to you all the time, you should put it into your I<.tcshrc> file or its equivalent. For I<bash> users this is I<.bashrc>, or you can put it in I</etc/profile> for use by all users.

If you cannot access your I<error_log> file because you are unable to telnet to your machine (generally the case when an ISP provides user CGI support but no telnet access), you might want to use a CGI script I wrote to fetch the latest lines from the file (with a bonus of colored output for easier reading). You might need to ask your ISP to install this script for general use. See L<Watching the error_log file without telneting to the server|snippets/Watching_the_error_log_File_With>.

=head2 The Importance of Warnings

Just like errors, Perl's mandatory warnings go to the I<error_log> file, if they are enabled. Of course you have enabled them in your development server, haven't you?

The code you write lives a dual life. In the first life it's being written, tested, debugged, improved, tested, debugged, rewritten, tested, debugged. In the second life it's I<just> used.

A significant part of the script's first life is spent on the developer's machine. The other part is spent on the production server where the creature is supposed to be perfect.

So when you develop the code you want all the help in the world to spot possible problems, and that's where enabling warnings is a must. Whenever you see an error or warning in the I<error_log>, you want to get rid of it.
That's very important. Why?

=over

=item *

If there are warnings, your code is not clean. If they are waved away, expect them to come back on the production server in the form of errors, when it's too late.

=item *

If each invocation of a script generates more than about five lines of warnings, it will be very hard to catch real problems. You just can't see them among all the other warnings which you used to think were unimportant.

=back

On the other hand, on a production server, you really I<want> to turn warnings off. And there are good reasons for that:

=over

=item *

There is no added value in having the same warning showing up, again and again, triggered by thousands of script invocations. If your code isn't very clean and generates even a single warning per script invocation, on a heavily loaded server you will end up with a huge I<error_log> file in a short time.

The warning elimination phase is supposed to be a part of the development process, and should be done before the code goes live.

=item *

In any Perl script, not just under mod_perl, enabling runtime warnings has a performance impact.

=back

mod_perl gives you a very simple solution to this warnings saga: don't enable warnings in the scripts unless you really have to. Let mod_perl control this mode globally. All you need to do is put the directive

  PerlWarn On

in I<httpd.conf> on your development machine and the directive

  PerlWarn Off

on the live box.

If there is a piece of code that generates warnings and you want to disable them only in this code, you can do that too. The Perl special variable C<$^W> allows you to turn warnings mode on and off dynamically. So just put the code into a block, and disable the warnings in the scope of this block. The original value of C<$^W> will be restored upon exit from the block.

  {
      local $^W = 0;
      # some code that generates innocuous warnings
  }

Unless you have a really good reason, for your own sake the advice is to I<avoid this technique>.

Don't forget the C<local()> operator! If you do, setting C<$^W> will affect B<all> the requests handled by the Apache child that changed this variable. And for B<all> the scripts it executes, not just the one which changed C<$^W>!

The C<diagnostics> pragma can shed more light on errors and warnings, as you will see in a moment.

=head3 diagnostics pragma

This module extends the terse diagnostics normally emitted by both the Perl compiler and the Perl interpreter, augmenting them with the more verbose and endearing descriptions found in the C<perldiag> manpage. Like the other pragmata, it affects the compilation phase of your scripts as well as the execution phase.

To use it in your program as a pragma, merely invoke

  use diagnostics;

at or near the start of your program. This also turns on C<-w> mode.

This pragma is especially useful when you are new to Perl and want a better explanation of the errors and warnings. It's also helpful when you encounter some warning you've never seen before, e.g. when a new warning has been introduced in an upgraded version of Perl.

You may not want to leave C<diagnostics> mode On for your production server. For each warning, C<diagnostics> mode generates ten times more output than warnings mode. If your code generates warnings, with the C<diagnostics> pragma you will use disk space much faster.

C<diagnostics> mode adds a large performance overhead in comparison with just having warnings mode On. You can see the benchmark results in the section 'L<Code Profiling Techniques|performance/Code_Profiling_Techniques>'.
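
If you want the verbose explanations only around a particular stretch of code, the pragma can also be toggled at runtime via its enable() and disable() functions. A minimal sketch:

  use diagnostics;          # verbose warnings from here on (also sets -w)

  diagnostics::disable();   # back to Perl's terse one-line warnings
  # ... code whose warnings you already understand ...
  diagnostics::enable();    # verbose explanations again

Note that the runtime switches affect only runtime warnings; compile-time diagnostics are emitted while the script is being compiled.
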
=head1 Handling the 'User pressed Stop button' case

When a user presses a B<STOP> or B<RELOAD> button, the current socket connection breaks (is aborted). It would be nice if Apache could always immediately detect this event. Unfortunately there is no way to tell whether the connection is still valid unless an attempt to read from or write to the connection is made.

Note that the detection technique we are going to present doesn't work if the connection to the back-end mod_perl server is coming from the front-end mod_proxy, as the latter doesn't break the connection to the back-end when the user has aborted the connection.

If the reading of the request's data is completed and the code does processing without writing anything back to the client, the broken connection won't be noticed. When an attempt is made to send at least one character to the client, the broken connection will be noticed and the C<SIGPIPE> signal (Broken pipe) will be sent to the process. The program can then halt its execution and perform all the cleanup it has to do.

Prior to Apache version 1.3.6, C<SIGPIPE> was handled by Apache. Currently Apache does not handle C<SIGPIPE> anymore; mod_perl takes care of it.

Under mod_perl, C<$r-E<gt>print> (or just print()) returns a I<true> value on success, a I<false> value on failure. The latter usually happens when the connection is broken.

If you want behaviour similar to the old C<SIGPIPE> one (as it was before Apache version 1.3.6), add the following configuration directive:

  PerlFixupHandler Apache::SIG

When Apache's C<SIGPIPE> handler is used, Perl may be left in the middle of its eval context, causing bizarre errors when subsequent requests are handled by that child. When C<Apache::SIG> is used, it installs a different C<SIGPIPE> handler which rewinds the context to make sure Perl is back to its normal state, preventing these bizarre errors. But in the general case, you don't need to use the above setting.

If you use this setting and you would like to log when a request was canceled by a SIGPIPE in your Apache I<access_log>, you must define a custom C<LogFormat> in your I<httpd.conf>, like so:

  PerlFixupHandler Apache::SIG
  LogFormat "%h %l %u %t \"%r\" %s %b %{SIGPIPE}e"

If the server has noticed that the request was canceled via a C<SIGPIPE>, then the log line will end with C<1>, otherwise it will just be a dash. e.g.:

  127.0.0.1 - - [09/Jan/2001:10:27:15 +0100] "GET /perl/stopping_detector.pl HTTP/1.0" 200 16 1
  127.0.0.1 - - [09/Jan/2001:10:28:18 +0100] "GET /perl/test.pl HTTP/1.0" 200 10 -

=head2 Detecting Aborted Connections

Let's use the knowledge we have acquired to trace the execution of the code and watch all the events as they happen. Let's take a little script that obviously I<"hangs"> the server process:

  stopping_detector.pl
  --------------------
  my $r = shift;
  $r->send_http_header('text/plain');

  print "PID = $$\n";
  $r->rflush;

  while(1){
      $i++;
      sleep 1;
  }

The script gets a request object C<$r> by shift()ing it from the C<@_> argument list passed by the handler() subroutine. (This magic is done by C<Apache::Registry>.) Then the script sends a C<Content-type> header, telling the client that we are going to send plain text as a response.

Next the script prints out a single line telling us the id of the process that handles this request, which we need to know in order to run the tracing utility. Then we flush Apache's buffer. (If we don't flush the buffer we will never see this short information printed. That's because our output is shorter than the buffer size and the script intentionally hangs, so the buffer won't be auto-flushed as the script hangs at the end.)

Then we enter an infinite C<while(1)> loop, which just increments a dummy variable and sleeps for a second.

Running C<strace -p PID>, where I<PID> is the process ID as printed to the browser, we see the following output printed every second:

  SYS_175(0, 0xbffff41c, 0xbffff39c, 0x8, 0) = 0
  SYS_174(0x11, 0, 0xbffff1a0, 0x8, 0x11) = 0
  SYS_175(0x2, 0xbffff39c, 0, 0x8, 0x2) = 0
  nanosleep(0xbffff308, 0xbffff308, 0x401a61b4, 0xbffff308, 0xbffff41c) = 0
  time([941281947]) = 941281947
  time([941281947]) = 941281947

Let's leave C<strace> running and press the B<STOP> button. Did anything change? No, the same system calls trace is printed every second, which means that Apache didn't detect the broken connection.

Now we are going to write the C<\0> (NULL) character to the client in an attempt to detect the broken connection as close as possible to the time the B<Stop> button is pressed. Therefore we modify the loop code in the following way:

  while(1){
      $r->print("\0");
      last if $r->connection->aborted;
      $i++;
      sleep 1;
  }

We add a print() statement to print a NULL character, and then we check whether the connection was aborted, with the help of the C<$r-E<gt>connection-E<gt>aborted> method. If the connection is broken, we break out of the loop.

We run this script and strace on it as before, but we see that it still doesn't work. The trouble is that we aren't flushing the buffer, so the characters stay in the buffer and won't be printed until the buffer gets full and is autoflushed. Since we want to attempt to write to the connection pipe all the time, after printing the NULL we add $r-E<gt>rflush(). Here is a new version of the code:

  stopping_detector2.pl
  ---------------------
  my $r = shift;
  $r->send_http_header('text/plain');

  print "PID = $$\n";
  $r->rflush;

  while(1){
      $r->print("\0");
      $r->rflush;
      last if $r->connection->aborted;
      $i++;
      sleep 1;
  }

After starting the C<strace> utility on the running process as we did before and pressing the B<Stop> button, we see the following output:

  SYS_175(0, 0xbffff41c, 0xbffff39c, 0x8, 0) = 0
  SYS_174(0x11, 0, 0xbffff1a0, 0x8, 0x11) = 0
  SYS_175(0x2, 0xbffff39c, 0, 0x8, 0x2) = 0
  nanosleep(0xbffff308, 0xbffff308, 0x401a61b4, 0xbffff308, 0xbffff41c) = 0
  time([941284358]) = 941284358
  write(4, "\0", 1) = -1 EPIPE (Broken pipe)
  --- SIGPIPE (Broken pipe) ---
  select(5, [4], NULL, NULL, {0, 0}) = 1 (in [4], left {0, 0})
  time(NULL) = 941284358
  write(17, "127.0.0.1 - - [30/Oct/1999:13:52"..., 81) = 81
  gettimeofday({941284359, 39113}, NULL) = 0
  times({tms_utime=9, tms_stime=8, tms_cutime=0, tms_cstime=0}) = 41551400
  close(4) = 0
  SYS_174(0xa, 0xbffff4e0, 0xbffff454, 0x8, 0xa) = 0
  SYS_174(0xe, 0xbffff46c, 0xbffff3e0, 0x8, 0xe) = 0
  fcntl(18, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}

Apache detects the broken pipe, as you can see from this snippet:

  write(4, "\0", 1) = -1 EPIPE (Broken pipe)
  --- SIGPIPE (Broken pipe) ---

Then it stops the script and does all the cleanup work, like access logging:

  write(17, "127.0.0.1 - - [30/Oct/1999:13:52"..., 81) = 81

where 17 is the file descriptor of the opened I<access_log> file.

=head2 The Importance of Cleanup Code

Cleanup code is a critical issue with aborted scripts. What happens to locked resources, if there are any? Will they be freed or not? If not, scripts using these resources and the same locking scheme will hang, waiting for them to be freed.
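
Under mod_perl, one safety net is C<$r-E<gt>register_cleanup()>, which registers a callback that Apache runs at the end of every request, whether it completed normally or was aborted. A minimal sketch (the lock file path is made up):

  use Fcntl qw(:flock);
  use Symbol ();

  my $r = shift;
  $r->send_http_header('text/plain');

  my $fh = Symbol::gensym();
  open $fh, ">/tmp/mylock" or die "Can't open /tmp/mylock: $!";
  flock $fh, LOCK_EX;

  # this callback runs at request completion even if the client
  # pressed Stop and the code below was never reached
  $r->register_cleanup(sub { flock $fh, LOCK_UN; close $fh; });

  # ... do something with the locked resource ...
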
First let's go one step back and recall what the problems and
solutions for this issue are under mod_cgi.

Under mod_cgi the resource locking issue is a problem only if you
happened to create external lock files and use them for lock
indication, instead of using flock(). If the script running under
mod_cgi is aborted between the lock and the unlock code, and you
didn't bother to write cleanup code to remove old dead locks, then
you are in big trouble.

The solution is to place the cleanup code in an C<END> block:

  END {
      # some code that ensures that locks are removed
  }

When the script is aborted, Apache will run the C<END> blocks.

If you use flock() things are much simpler, since all opened files
will be closed when the script exits. When the file is closed, the
lock is removed as well--all the locked resources get freed. There
are systems where flock(2) is unavailable; on those you can use
Perl's emulation of this function.

With mod_perl things can be more complex when you use global
variables as filehandles. Because the processes don't exit after
processing a request, files won't be closed unless you explicitly
close() them or reopen them with the open() call, which first closes
the file. Let's see what problems we might encounter, and the
possible solutions for them.

=head3 Critical Section

First we want to make a little detour to discuss the I<"critical
section"> issue. Let's start with a resource locking scheme. A
schematic representation of a proper locking technique is as follows:

  1. lock a resource
     <critical section starts>
  2. do something with the resource
     <critical section ends>
  3. unlock the resource

If the locking is exclusive, only one process can hold the resource
at any given time, which means that all the other processes will have
to wait. The code between the locking and unlocking functions can
therefore become a service bottleneck. That's why this code section
is called critical, and once started it should be finished as soon as
possible.

Even if you use a shared locking scheme, where many processes are
allowed to access the resource concurrently, it's still important to
keep the critical section as short as possible, in case some
processes occasionally want to acquire an exclusive lock.

The next example uses a shared lock, but has a poorly-designed
critical section:

  critical_section_sh.pl
  ----------------------
  use Fcntl qw(:flock);
  use Symbol;

  my $fh = gensym;
  open $fh, "/tmp/foo" or die $!;

  flock $fh, LOCK_SH;
  # start critical section

  seek $fh, 0, 0;
  my @lines = <$fh>;
  for (@lines) {
      print if /foo/;
  }

  # end critical section
  close $fh; # close unlocks the file

The code opens the file for reading, locks it, rewinds it to the
beginning, reads all the lines from the file and prints out those
that contain the string I<'foo'>.

The C<gensym()> function imported by the C<Symbol> module creates an
anonymous glob and returns a reference to it. Such a glob reference
can be used as a file or directory handle, and it therefore allows
lexically scoped variables to be used as filehandles. C<Fcntl>
imports file locking symbols, such as C<LOCK_SH>, C<LOCK_EX> and
more, into the script's namespace. Refer to the C<Fcntl> manpage for
more information.

If the file the script reads is big, it will take a relatively long
time for this code to complete. All this time the file remains open
and locked.
While other processes may still access this file for reading (a
shared lock), a process that wants to modify the file (which requires
the acquisition of an exclusive lock) will be blocked, waiting for
this section to complete.

We can optimize the critical section as follows. Once the file has
been read, we have all the information we need from it. (To keep the
example simple we've chosen to just print out the matching lines; in
reality the processing code might be much longer.) We don't need the
file to be open while the loop executes, because we don't access it
inside the loop. If we close the file before we start the loop, we
will allow other processes to obtain exclusive access to the file if
they need it, instead of blocking them for no reason.

In the following corrected version of the previous example, we only
read the contents of the file during the critical section and process
them afterwards, without creating a possible bottleneck:

  critical_section_sh2.pl
  -----------------------
  use Fcntl qw(:flock);
  use Symbol;

  my $fh = gensym;
  open $fh, "/tmp/foo" or die $!;

  flock $fh, LOCK_SH;
  # start critical section

  seek $fh, 0, 0;
  my @lines = <$fh>;

  # end critical section
  close $fh; # close unlocks the file

  for (@lines) {
      print if /foo/;
  }

Here is another similar example, but this time using an exclusive
lock. The script reads in a file and writes it back, adding a number
of new text lines to the head of the file:

  critical_section_ex.pl
  ----------------------
  use Fcntl qw(:flock);
  use Symbol;

  my $fh = gensym;
  open $fh, "+>>/tmp/foo" or die $!;

  flock $fh, LOCK_EX;
  # start critical section

  seek $fh, 0, 0;
  my @add_lines =
    (
     qq{Complete documentation for Perl, including FAQ lists,\n},
     qq{should be found on this system using `man perl' or\n},
     qq{`perldoc perl'. If you have access to the Internet, point\n},
     qq{your browser at http://www.perl.com/, the Perl Home Page.\n},
    );

  my @lines = (@add_lines, <$fh>);
  seek $fh, 0, 0;
  truncate $fh, 0;
  print $fh @lines;

  # end critical section
  close $fh; # close unlocks the file

Since we want to read the file, modify it and write it back without
anyone else changing it in between, we open it for reading and
writing with the help of I<+E<gt>E<gt>> and lock it with an exclusive
lock. You cannot safely accomplish this task by opening the file
first for reading and then reopening it for writing, since another
process might change the file between the two events. (You could get
away with I<+E<lt>> as well; please refer to the I<perlfunc> manpage
for more information about the open() function.)

Next, the code prepares the lines of text it wants to add to the head
of the file, and assigns them, together with the contents of the
file, to the C<@lines> array. Now we have our data ready to be
written back to the file, so we seek() to the start of the file and
truncate() it to zero size. In our example the file always grows, so
in this case there is actually no need to truncate it; but if there
was a chance that the file might shrink, then truncating would be
necessary. In any case it's good practice to always use truncate(),
as you never know what changes your code might undergo in the future,
and the truncate() operation does not carry any significant
performance penalty. Finally we write the data back to the file and
close it, which unlocks it as well.

Did you notice that we created the text lines to be added as close to
the place of usage as possible? This complies with good I<"locality
of code"> style, but it makes the critical section longer.
In such cases you should sacrifice style in order to make the
critical section as short as possible. An improved version of this
script, with a shorter critical section, looks like this:

  critical_section_ex2.pl
  -----------------------
  use Fcntl qw(:flock);
  use Symbol;

  my @lines =
    (
     qq{Complete documentation for Perl, including FAQ lists,\n},
     qq{should be found on this system using `man perl' or\n},
     qq{`perldoc perl'. If you have access to the Internet, point\n},
     qq{your browser at http://www.perl.com/, the Perl Home Page.\n},
    );

  my $fh = gensym;
  open $fh, "+>>/tmp/foo" or die $!;

  flock $fh, LOCK_EX;
  # start critical section

  seek $fh, 0, 0;
  push @lines, <$fh>;

  seek $fh, 0, 0;
  truncate $fh, 0;
  print $fh @lines;

  # end critical section
  close $fh; # close unlocks the file

There are two important differences. First, we prepare the text lines
to be added I<before> the file is locked. Second, instead of creating
a new array and copying lines from one array to another, we append
the file directly to the C<@lines> array.

=head3 Safe Resource Locking and Cleanup Code

Let's get back to the main issue of this section, which is safe
resource locking. Unless you use the C<Apache::PerlRun> handler,
which does the cleanup for you, you will encounter lots of problems
if you don't make a habit of closing all the files that you open.

If you open a file but don't close it, you may leak file descriptors.
Since the number of file descriptors available to a process is
finite, at some point you will run out of them and your service will
fail. This is bad, but you can live with it for a while--until you
run out of file descriptors, which will happen much sooner on a
heavily used server. You can use system utilities to observe the open
and locked files, as well as the processes that have opened (and
locked) them. On FreeBSD you would use the fstat(1) utility; on many
other UN*X flavors the lsof(1) utility is available.

But this is nothing compared to the trouble you will bring on
yourself if the code terminates while the file stays locked. Any
other process requesting a lock on the same file (or resource) will
wait indefinitely for it to become unlocked. Since this will not
happen until the server reboots, all the processes trying to use this
resource will hang. Here is an example of such a terrible mistake:

  flock.pl
  --------
  use Fcntl qw(:flock);
  open IN, "+>>filename" or die "$!";
  flock IN, LOCK_EX;
  # do something
  # quit without closing and unlocking the file

Is this safe code? No--we forgot to close the file. So let's add the
close():

  flock2.pl
  ---------
  use Fcntl qw(:flock);
  open IN, "+>>filename" or die "$!";
  flock IN, LOCK_EX;
  # do something
  close IN;

Is it safe code now? Unfortunately it is not. There is a chance that
the user aborts the request (for example by pressing the browser's
C<Stop> or C<Reload> buttons) during the critical section. The script
will then be aborted before it has had a chance to close() the file,
which is just as bad as if we had forgotten to close it.

Actually, if the same process runs the same code again, an open()
call will close the file first, which will unlock the resource. This
works because C<IN> is a global variable. But it's quite possible
that the process that created the lock will not serve the same
request for a while, since it will be busy serving other requests, so
relying on it to reopen the file is a bad idea.

This problem happens B<only> if you use global variables as file
handles. The following example has the same problem.
  flock3.pl
  ---------
  use Fcntl qw(:flock);
  use Symbol ();

  use vars qw($fh);
  $fh = Symbol::gensym();

  open $fh, "+>>filename" or die "$!";
  flock $fh, LOCK_EX;
  # do something
  close $fh;

C<$fh> is still a global variable, and therefore the code using it
suffers from the same problem.

The simplest solution to this problem is to always use lexically
scoped variables (created with my()). Whether the script is aborted
before close() is called, or you simply forgot to close() the file,
the lexically scoped variable will always go out of scope and the
file will be closed--and if it was locked, it will be unlocked. Here
is a good version of the code:

  flock4.pl
  ---------
  use Fcntl qw(:flock);
  use Symbol ();

  my $fh = Symbol::gensym();
  open $fh, "+>>filename" or die "$!";
  flock $fh, LOCK_EX;
  # do something
  close $fh;

Please don't conclude from this example that you don't have to close
files anymore because they will be closed for you automatically.
Relying on that is bad style and should be avoided.

mod_perl comes with its own implementation of gensym(), so you don't
even need to load the C<Symbol> module in order to use this function.
Under mod_perl it resides in the C<Apache> package. For example:

  use Apache;
  my $fh = Apache::gensym();
  open $fh, "+>>filename" or die "$!";
  ...

If you insist on using file globs, at least make sure that you
local()'ize them. Then, if the flow of the code is interrupted before
close() is called, the filehandle will be closed automatically when
the local()'ized variable goes out of scope. The following example
shows that the file is indeed closed even though there is no close():

  /tmp/io.pl
  ----------
  #!/usr/bin/perl
  # /dev/null so strace output is more readable
  open my $fh, ">/dev/null";
  select $fh;
  $| = 1;
  {
      print "enter";
      local *FH;
      open FH, $0;
      print "leave"
  }
  print "done";

This simple script opens I</dev/null> and tells Perl to send all
STDOUT output there, also making it unbuffered. Then a block is
created in which the C<FH> file glob is localized, and C<FH> is used
to open the source code of the script (which resides in C<$0>). To
separate the events of entering and leaving the block scope, debug
print statements are used.

Now let's run the script under strace(1), which proves once again to
be a very useful item in the tool bag of the mod_perl programmer:

  % strace /tmp/io.pl

  write(3, "enter", 5)           = 5
  -> open("/tmp/io.pl", O_RDONLY) = 4
  fstat(4, {st_mode=S_ISGID|S_ISVTX|0401, st_size=0, ...}) = 0
  fcntl(4, F_SETFD, FD_CLOEXEC)  = 0
  write(3, "leave", 5)           = 5
  -> close(4)                    = 0
  write(3, "done", 4)            = 4

So you can see that I</tmp/io.pl> is actually close()'d.

Under Perl version 5.6 the C<Symbol.pm>-like functionality is a
built-in feature, so you can write:

  open my $fh, ">/tmp/foo" or die $!;

and C<$fh> will be automatically vivified into a valid filehandle.
You don't need to use the C<Symbol> module anymore, as long as
backward compatibility is not a requirement.

You can also use the C<IO::*> modules, such as C<IO::File> or
C<IO::Dir>. These are much bigger than the C<Symbol> module, and
worth using for files or directories only if you are already using
them for the other features they provide. As a matter of fact, these
modules use the C<Symbol> module themselves. Here is an example of
their usage:

  use IO::File;
  use IO::Dir;
  my $fh = IO::File->new(">filename");
  my $dh = IO::Dir->new("dirname");

If you still have to use global filehandles, there are a few
approaches we can take to solve the locking problem.
If you are running under C<Apache::Registry> and friends, the C<END>
blocks will perform the cleanup work for you. You can use C<END> in
the same way for scripts running under mod_cgi, or for plain Perl
scripts. Just add the cleanup code to this block and you are safe.

For example, if you work with dbm files, it's important (just as with
locking) to flush the dbm buffers by calling the sync() method:

  END {
      # make sure that the DB is flushed
      $dbh->sync();
  }

Under mod_perl the above will work only for C<Apache::Registry>
scripts; in other code the execution of the C<END> block is postponed
until the process terminates, i.e. it runs only once, when the Apache
child exits. So if you are writing your own handlers using the Perl
API, use the register_cleanup() method instead. It accepts a
reference to a subroutine as an argument:

  $r->register_cleanup(sub { $dbh->sync() });

Even better, check whether the client connection has been aborted;
otherwise the cleanup code will always be executed, and for normally
terminated scripts this may not be what you want:

  $r->register_cleanup(
      # make sure that the DB is flushed
      sub {
          $dbh->sync() if Apache->request->connection->aborted();
      }
  );

In the case of C<END> block usage you would similarly write:

  END {
      # make sure that the DB is flushed
      $dbh->sync() if Apache->request->connection->aborted();
  }

Note that if you use register_cleanup(), it should be called at the
beginning of the script, or as soon as the variables you want to use
in the cleanup code become available. If you use it at the end of the
script, and the script happens to be aborted before that point is
reached, no cleanup will be performed. For example, C<CGI.pm>
registers its cleanup subroutine in its new() method:

  sub new {
      # code snipped
      if ($MOD_PERL) {
          Apache->request->register_cleanup(\&CGI::_reset_globals);
          undef $NPH;
      }
      # more code snipped
  }

There is another way to register a section of cleanup code for Perl
API handlers: you can use C<PerlCleanupHandler> in the configuration
file, like this:

  <Location /foo>
    SetHandler perl-script
    PerlHandler Apache::MyModule
    PerlCleanupHandler Apache::MyModule::cleanup()
    Options ExecCGI
  </Location>

C<Apache::MyModule::cleanup()> performs the cleanup, obviously.

=head1 Handling Server Timeout Cases and Working with $SIG{ALRM}

A situation similar to the L<Pressed Stop button
disease|debug/Handling_the_User_pressed_Stop_> arises when the
browser times out the connection (the exact timeout varies between
browsers, but it is on the order of a couple of minutes). There are
cases when your script is about to perform a very long operation and
there is a chance that its duration will be longer than the client's
timeout. One example is database interaction, where the DB engine
hangs or needs a long time to return its results. If this is the
case, use C<$SIG{ALRM}> to prevent the timeouts:

  my $timeout = 10; # seconds
  eval {
      local $SIG{ALRM} =
          sub { die "Sorry, timed out. Please try again\n" };
      alarm $timeout;
      ... db stuff ...
      alarm 0;
  };
  die $@ if $@;
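To make the pattern more concrete, here is a hedged sketch wrapping a
potentially slow database call; the DSN, table and query are
hypothetical, and only the eval()/alarm() scaffolding is the point:

  use DBI ();

  my $timeout = 10; # seconds

  my $rows = eval {
      local $SIG{ALRM} = sub { die "timed out\n" };
      alarm $timeout;

      # hypothetical slow operation -- replace with your own
      my $dbh = DBI->connect("dbi:mysql:test", "user", "password",
                             { RaiseError => 1 });
      my $res = $dbh->selectall_arrayref("SELECT id, name FROM big_table");

      alarm 0; # success -- cancel the alarm
      $res;
  };
  if ($@) {
      alarm 0; # make sure no alarm is left pending
      die $@ unless $@ eq "timed out\n"; # rethrow other errors
      warn "the DB operation took longer than $timeout seconds\n";
  }

Canceling the alarm both on success and in the error branch ensures
that a pending alarm cannot fire later, in the middle of unrelated
code.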
It was recently discovered that C<local $SIG{'ALRM'}> does not
restore the original underlying C handler. This was fixed in mod_perl
1.19_01 (L<CVS version|download/mod_perl>). As a matter of fact, none
of the C<local $SIG{FOO}> handlers restores the original C handler --
read
L<Debugging Signal Handlers ($SIG{FOO})|debug/Debugging_Signal_Handlers_SIG_>
for a debugging technique and a possible workaround.

=head1 Looking inside the server

Your server is up and running, but something appears to be wrong. You
want to see the numbers to tune your code or server configuration.
You just want to know what's really going on inside the server. How
do you do it?

There are a few tools that allow you to look inside the server.

=head2 Apache::Status -- Embedded Interpreter Status Information

This is a very useful module. It lets you watch what happens to the
Perl parts of the server. You can see the size of all subroutines and
variables, variable dumps, lexical information, OPcode trees, and
more.

You shouldn't use it on a production server, as it adds quite a bit
of overhead to each request.

=head3 Minimal Configuration

This configuration enables the C<Apache::Status> module with its
minimum feature set. Add this to I<httpd.conf>:

  <Location /perl-status>
    SetHandler perl-script
    PerlHandler Apache::Status
    order deny,allow
    #deny from all
    #allow from
  </Location>

If you are going to use C<Apache::Status>, it's important to load it
as the first module in the start-up file, or in I<httpd.conf>:

  # startup.pl
  use Apache::Status ();
  use Apache::Registry ();
  use Apache::DBI ();

If you don't put C<Apache::Status> before C<Apache::DBI>, you won't
get the C<Apache::DBI> menu entry in the status. For more about
C<Apache::DBI> see
L<Persistent DB Connections|performance/Persistent_DB_Connections>.

=head3 Extended Configuration

There are several variables which you can use to modify the behaviour
of C<Apache::Status>:

=over

=item * PerlSetVar StatusOptionsAll On

This single directive will enable all of the options described below.

=item * PerlSetVar StatusDumper On

When you are browsing symbol tables, you can view the values of your
arrays, hashes and scalars with C<Data::Dumper>.

=item * PerlSetVar StatusPeek On

With this option On and the C<Apache::Peek> module installed,
functions and variables can be viewed in C<Devel::Peek> style.

=item * PerlSetVar StatusLexInfo On

With this option On and the C<B::LexInfo> module installed,
subroutine lexical variable information can be viewed.

=item * PerlSetVar StatusDeparse On

With this option On and C<B::Deparse> version 0.59 or higher
(included in Perl 5.005_59+), subroutines can be "deparsed". Options
can be passed to C<B::Deparse::new> like so:

  PerlSetVar StatusDeparseOptions "-p -sC"

See the C<B::Deparse> manpage for details.

=item * PerlSetVar StatusTerse On

With this option On, text-based op tree graphs of subroutines can be
displayed, thanks to C<B::Terse>.

=item * PerlSetVar StatusTerseSize On

With this option On and the C<B::TerseSize> module installed,
text-based op tree graphs of subroutines and their size can be
displayed. See the C<B::TerseSize> docs for more info.

=item * PerlSetVar StatusTerseSizeMainSummary On

With this option On and the C<B::TerseSize> module installed, "Memory
Usage" will be added to the C<Apache::Status> main menu. This option
is disabled by default, as it can be rather CPU intensive to
summarize memory usage for the entire server. It is strongly
suggested that this option only be used with a development server
running in -X mode, as the results will be cached.
Remember to preload C<B::TerseSize> with:

  PerlModule B::TerseSize

=item * PerlSetVar StatusGraph

When C<StatusDumper> (see above) is enabled, another link, I<"OP Tree
Graph">, will be present with the dump if this configuration variable
is set to On.

This requires the C<B> module (part of the Perl compiler kit) and the
C<B::Graph> module, version 0.03 or higher, to be installed along
with the `dot' program. Dot is part of the graph visualization
toolkit from AT&T: http://www.research.att.com/sw/tools/graphviz/ .

WARNING: Some graphs may produce very large images, and some graphs
may produce no image at all if C<B::Graph>'s output is incorrect.

=back

There is more information about C<Apache::Status> in its manpage.

=head3 Usage

Assuming that your mod_perl server listens on port 81, fetch
http://www.myserver.com:81/perl-status :

  Embedded Perl version 5.00502 for Apache/1.3.2 (Unix) mod_perl/1.16
  process 187138, running since Thu Nov 19 09:50:33 1998

All the sections below are links when you view them through
I</perl-status>:

  Signal Handlers
  Enabled mod_perl Hooks
  PerlRequire'd Files
  Environment
  Perl Section Configuration
  Loaded Modules
  Perl Configuration
  ISA Tree
  Inheritance Tree
  Compiled Registry Scripts
  Symbol Table Dump

Let's follow, for example, C<PerlRequire>'d Files. We see:

  PerlRequire                   Location
  /home/perl/apache-startup.pl  /home/perl/apache-startup.pl

From some menus you can move deeper, to peek into the internals of
the server, to see the values of the global variables in the
packages, to see the cached scripts and modules, and much more. Just
click around...

=head3 Compiled Registry Scripts section seems to be empty

Sometimes when you fetch I</perl-status> and look at the B<Compiled
Registry Scripts> section, you see no listing of scripts at all. This
is correct: C<Apache::Status> shows the registry scripts compiled in
the httpd child which is serving your request for I</perl-status>. If
that child has not yet compiled the script you are asking for,
I</perl-status> will just show you the main menu.

=head2 mod_status

The mod_status module allows a server administrator to find out how
well the server is performing. An HTML page is presented that gives
the current server statistics in an easily readable form. If
required, given a compatible browser, this page can be automatically
refreshed. Another page gives a simple machine-readable list of the
current server state.

This Apache module is written in C. It is compiled by default, so all
you have to do to use it is enable it in your configuration file:

  <Location /status>
    SetHandler server-status
  </Location>

For security reasons you will probably want to limit access to it. If
you have installed Apache according to the instructions, you will
find a prepared configuration section in I<httpd.conf>; to enable use
of the mod_status module, just uncomment it:

  ExtendedStatus On
  <Location /status>
    SetHandler server-status
    order deny,allow
    deny from all
    allow from localhost
  </Location>

You can now access server statistics by pointing a Web browser at the
page http://localhost/status (as long as your server recognizes
localhost).
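The machine-readable variant mentioned above is produced when you
append C<?auto> to the URL. Here is a minimal sketch that fetches it
and picks out a few fields; the URL assumes the C<Location>
configured above, and the field names are the ones Apache 1.x emits
(C<Total Accesses> requires C<ExtendedStatus On>):

  #!/usr/bin/perl -w
  # fetch mod_status's machine-readable page and report a few fields
  use strict;
  use LWP::Simple qw(get);

  my $stats = get("http://localhost/status?auto")
      or die "Couldn't fetch the status page\n";

  # each line looks like: "Key: value"
  for my $line (split /\n/, $stats) {
      my ($key, $value) = split /:\s*/, $line, 2;
      print "$key = $value\n"
          if $key =~ /^(?:Total Accesses|BusyServers|IdleServers)$/;
  }

Such a script gives you a quick, scriptable health check without
having to scrape the HTML page.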
The details given by mod_status are:

=over

=item * The number of children serving requests

=item * The number of idle children

=item * The status of each child, the number of requests that child
has performed and the total number of bytes served by the child

=item * The total number of accesses and the total bytes served

=item * The time the server was last started/restarted and how long
it has been running

=item * Averages giving the number of requests per second, the number
of bytes served per second and the average number of bytes per
request

=item * The current percentage of CPU used by each child and in total
by Apache

=item * The current hosts and requests being processed

=back

=head2 Apache::VMonitor -- Visual System and Apache Server Monitor

This module is covered in the section
"L<Apache::* Modules|modules/Apache_VMonitor_Visual_Syste>".

=head1 Sometimes My Script Works, Sometimes It Does Not

See L<Sometimes it Works Sometimes it does
Not|porting/Sometimes_it_Works_Sometimes_it>.

=head1 Code Debug

When the code doesn't perform as expected--either never, or just
sometimes--we say that the code needs debugging. There are several
levels of debugging complexity.

The basic level is when Perl terminates the program during the
compilation phase, before it tries to run the resulting byte-code.
This usually happens because there are syntax errors in the code, or
perhaps because a module is missing. Sometimes it takes quite an
effort to solve these problems, since code that uses Apache core
modules generally won't compile when executed from the shell. We will
learn how to solve syntax problems in mod_perl code quite easily.

Once the program compiles and begins to run, there might be logical
problems, when the program doesn't do what you thought you had
programmed it to do. These are somewhat harder to solve, especially
when there is a lot of code to be inspected and reviewed, but it's
just a matter of time. Perl can help a lot here; for example,
enabling warnings helps to locate typos. If you wanted to compare two
numbers but omitted the second '=' character, so that you had
something like C<if $yes = 1> instead of C<if $yes == 1>, Perl warns
about the assignment inside the conditional.
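For instance (a quick demonstration from the shell; the exact warning
text may vary slightly between Perl versions):

  % perl -we 'my $yes; print "equal\n" if $yes = 1'
  Found = in conditional, should be == at -e line 1.
  equal

The warning is issued at compile time, so you catch the typo before
the program even runs.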
The next level is when the program does what it's expected to do most
of the time, but occasionally misbehaves. Often you find that print()
statements or the Perl debugger can help, while inspection of the
code generally doesn't. Debugging with print() is often quite easy,
but typing the debug messages can become very tedious; that's where
the Perl debugger comes into its own.

While print() statements always work, running the Perl debugger for
CGI scripts might be quite a challenge. But with the right knowledge
and tools handy, the debugging process becomes much easier.
Unfortunately there is no single easy way to debug your programs, as
the debugging depends entirely on your code. Debugging really complex
code can be a nightmare, but as your style matures you learn ways to
write simpler code that is easier to debug. You will probably find
that when you write simpler, clearer code it does not need so much
debugging in the first place.

One of the most difficult cases to debug is when the process just
terminates in the middle of processing a request and dumps core.
Often this happens because the program tried to access a memory area
that doesn't belong to it. The operating system halts the process,
tidies up and dumps core (i.e. it creates a file called I<core> in
the current directory of the process that was running).

This is something that you rarely see with plain Perl scripts, but it
can easily happen if you use modules written in C or C++ and
something goes wrong with them. Occasionally you will come across a
bug in mod_perl itself (mod_perl is written in C) that was in a deep
slumber before your code awakened it.

In the following sections we will go through each of the problems
presented, discuss them thoroughly and present a few techniques for
solving them.

=head2 Locating and correcting Syntax Errors

While developing code we often make syntax mistakes, like forgetting
to put a comma in a list or a semicolon at the end of a statement.

Even at the end of a {} block, where a semicolon is not required
after the last statement, it may be better to put one in: there is a
chance that you will add more code later, and when you do you might
forget to add the now-required semicolon. Similarly, more items might
be added later to a list; unlike many other languages, Perl has no
problem with a redundant comma at the end of a list.

One approach to locating syntactically incorrect code is to execute
the script from the shell with the C<-c> flag. This tells Perl to
check the syntax but not to run the code (actually, it will execute
C<BEGIN> blocks, C<END> blocks and use() calls, because these are
considered to occur outside the execution of your program). Note also
that Perl 5.6.0 introduced a new special variable, C<$^C>, which is
set to true when Perl is run with the C<-c> flag; this provides an
opportunity to have some further control over C<BEGIN> and C<END>
blocks during syntax checking. It's also a good idea to add the
C<-w> switch to enable warnings:

  perl -cw test.pl

If there are errors in the code, Perl will report them and tell you
at which line numbers in your script they were found.

The next step is to execute the script, since in addition to syntax
errors there may be run-time errors. These are the errors that cause
the I<"Internal Server Error"> page when the script is executed from
a browser. With plain CGI scripts it's the same as running plain Perl
scripts--just execute them and see that they work.

The whole thing is quite different with scripts that use C<Apache::*>
modules, which can be used only from within the mod_perl server
environment. These scripts rely on other code and on an environment
which isn't available when you attempt to execute the script from the
shell: for example, there is no Apache request object available to
the code outside the server.

If you have a problem when using C<Apache::*> modules, you can make a
request to the script from a browser and watch the errors and
warnings as they are logged to the I<error_log> file. Alternatively
you can use the C<Apache::FakeRequest> module.

=head2 Using Apache::FakeRequest to Debug Apache Perl Modules

C<Apache::FakeRequest> is used to set up an empty Apache request
object that can be used for debugging. Its methods just set internal
variables with the same names as the methods and return the values of
those variables. Initial values for methods can be specified when the
object is created. The print method prints to STDOUT.

Subroutines for the Apache constants are also defined, so that you
can use C<Apache::Constants> while debugging, although the values of
the constants are hard-coded rather than extracted from the Apache
source code.
Let's write a very simple module, which prints I<"OK"> to the
client's browser:

  package Apache::Example;
  use Apache::Constants;

  sub handler {
      my $r = shift;
      $r->send_http_header('text/plain');
      print "You are OK ", $r->get_remote_host, "\n";
      return OK;
  }
  1;

You cannot debug this module unless you configure the server to run
it, by calling its handler from somewhere. For example, you could put
this in I<httpd.conf>:

  <Location /ex>
    SetHandler perl-script
    PerlHandler Apache::Example
  </Location>

Then, after restarting the server, you could start a browser, request
the location http://localhost/ex and examine the output. Tedious, no?

But with the help of C<Apache::FakeRequest> you can write a little
script that will emulate a request and return the output:

  #!/usr/bin/perl

  use Apache::FakeRequest ();
  use Apache::Example ();

  my $r = Apache::FakeRequest->new('get_remote_host'=>'www.foo.com');
  Apache::Example::handler($r);

When you execute the script from the command line, you will see the
following output:

  You are OK www.foo.com

=head2 Finding the Line Which Triggered the Error or Warning

Perl has no problem with line numbers and file names for modules that
are read from disk in the normal way, but modules that are compiled
via eval(), such as C<Apache::Registry> and C<Apache::PerlRun>
scripts, sometimes get confused with some versions of Perl. For
example, a Perl E<lt>E<lt>HEREDOC inside eval "" confuses Perl's
current line-number counter; newer Perls fix this, and for older
Perls compiling with the experimental B<PERL_MARK_WHERE=1> option
should solve it.

There is also a compiler directive which resets the line-number
counter to any value you decide. You can pepper your code with these
directives to help you locate the problem. At the beginning of a line
you write something of the form:

  #line nnn label

For example:

  #line 298 myscript.pl

or:

  #line 890 some_label_to_be_used_in_the_error_message

The '#' must be in the first column, so if you cut and paste from
this text you must remember to remove any leading whitespace. The
label is optional--the filename of the script will be used by
default. This directive sets the line number of the B<following>
line, not the line the directive is on.

You can use a little script to stuff every N lines of your code with
these directives, but then you will have to remember to rerun it
every time you add or remove code lines. Here is such a script:

  #!/usr/bin/perl
  # Puts Perl line markers in a Perl program for debugging purposes.
  # Also takes out old line markers.
  die "No filename to process.\n" unless @ARGV;
  my $filename = shift;
  my $lines = 100;
  open IN, $filename or die "Cannot open file: $filename: $!\n";
  open OUT, ">$filename.marked"
      or die "Cannot open file: $filename.marked: $!\n";
  my $counter = 1;
  while (<IN>) {
      print OUT "#line $counter\n" unless $counter++ % $lines;
      next if /^#line /;
      print OUT $_;
  }
  close OUT;
  close IN;
  chmod 0755, "$filename.marked";

Another way of narrowing down the area to be searched is to move most
of the code into separate modules. This ensures that the line numbers
will be reported correctly.

To get a complete trace of calls, add:

  use Carp ();
  local $SIG{__WARN__} = \&Carp::cluck;

=head2 Using print() for Debugging

The universal debugging tool across nearly all platforms and
programming languages is printf() or the equivalent output function.
This can send data to the console, a file, an application window and
so on. In Perl we generally use the print() function.

With an idea of where and when the bug is triggered, a developer can
insert print() statements in the source code to examine the values of
the data at certain stages of execution. However, it is rather
difficult to anticipate all the possible directions a program might
take and to know what data to suspect of causing trouble. In
addition, inline debugging code tends to add bloat, degrades the
performance of an application, and can make the code harder to read
and maintain. Moreover, you have to comment out or remove the
debugging print() calls once you think you have solved the problem;
if you later discover that you need to debug the same code again, you
need at best to uncomment the debugging code lines or, at worst, to
write them again from scratch.
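One way to soften this maintenance burden is a common Perl idiom (a
sketch; the constant and environment variable names here are our own
invention): guard the debug statements with a compile-time constant.
When the constant is false, Perl folds C<if DEBUG> at compile time
and discards the guarded statements entirely, so they can stay in the
code at no runtime cost:

  use strict;
  use constant DEBUG => $ENV{MY_DEBUG} ? 1 : 0;

  my $sum = 2 + 2;
  warn "sum is $sum\n" if DEBUG;   # compiled away when DEBUG is false
  print "$sum\n";

Run the script with C<MY_DEBUG=1> set in the environment to see the
debug output, and without it for production behaviour.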
Let's see a few examples where we use print() to debug some problem.
In one of my applications I wrote a function that returns the date of
one week ago. Here it is, together with a short test:

  print "Content-type: text/plain\r\n\r\n";
  print "A week ago the date was ", date_a_week_ago(), "\n";

  # return a date one week ago as a string in format: MM/DD/YYYY
  ####################
  sub date_a_week_ago {

      my @month_len = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);

      my ($day, $month, $year) = (localtime)[3..5];

      for (my $j = 0; $j < 7; $j++) {

          $day--;
          if ($day == 0) {

              $month--;
              if ($month == 0) {
                  $year--;
                  $month = 12;
              }

              # there are 29 days in February in a leap year
              $month_len[1] =
                  (($year % 4 or $year % 100 == 0) and $year % 400)
                      ? 28 : 29;

              # set $day to be the last day of the previous month
              $day = $month_len[$month - 1];

          } # end of if ($day == 0)

      } # end of for (my $j = 0; $j < 7; $j++)

      return sprintf "%02d/%02d/%04d", $month, $day, $year+1900;
  }

This code is pretty straightforward. We get today's date and subtract
one from the value of the day, updating the month and the year on the
way if boundaries are being crossed (end of month, end of year). If
we do it seven times in a loop, at the end we should get the date of
a week ago.

Note that localtime() returns the year as the number of years since
1900 (which means that we don't have a century boundary to worry
about). If we are in the middle of the first week of the year 2000,
the value of the year returned by localtime() will be C<100>, not
C<0> as you might mistakenly assume. So when the code does C<$year-->
it becomes C<99> and not C<-1>. At the end we add 1900 to get back
the correct four-digit year format. (This is all correct as long as
you don't go back past the year 1900.)

Also note that we have to account for leap years, where there are 29
days in February. For the other months we have prepared an array
containing the month lengths.

Now when we run this code and check the result, we see that something
is wrong. For example, if today is C<10/23/1999> we expect the above
code to print C<10/16/1999>. In fact it prints C<09/16/1999>, which
means that we have lost a month. The above code is buggy!

Let's put a few debug print() statements in the code, near the
C<$month> variable:

  sub date_a_week_ago {

      my @month_len = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);

      my ($day, $month, $year) = (localtime)[3..5];
      print "[set] month : $month\n"; # DEBUG

      for (my $j = 0; $j < 7; $j++) {

          $day--;
          if ($day == 0) {

              $month--;
              if ($month == 0) {
                  $year--;
                  $month = 12;
              }
              print "[loop $j] month : $month\n"; # DEBUG

              # there are 29 days in February in a leap year
              $month_len[1] =
                  (($year % 4 or $year % 100 == 0) and $year % 400)
                      ? 28 : 29;
              # set $day to be the last day of the previous month
              $day = $month_len[$month - 1];

          } # end of if ($day == 0)

      } # end of for (my $j = 0; $j < 7; $j++)

      return sprintf "%02d/%02d/%04d", $month, $day, $year+1900;
  }

When we run it we see:

  [set] month : 9

It is supposed to be the number of the current month (C<10>), but
actually it is not. We have spotted a bug, since the only code that
sets the C<$month> variable consists of a call to localtime(). So did
we find a bug in Perl? Let's look at the manpage of the localtime()
function:

  % perldoc -f localtime

  Converts a time as returned by the time function to a 9-element
  array with the time analyzed for the local time zone. Typically
  used as follows:

  #  0    1    2     3     4    5     6     7     8
  ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
                                          localtime(time);

  All array elements are numeric, and come straight out of a struct
  tm. In particular this means that $mon has the range 0..11 and
  $wday has the range 0..6 with Sunday as day 0. Also, $year is the
  number of years since 1900, that is, $year is 123 in year 2023,
  and not simply the last two digits of the year. If you assume it
  is, then you create non-Y2K-compliant programs--and you wouldn't
  want to do that, would you?

  [more info snipped]

This reveals that if we want to count months from 1 to 12, and not
from 0 to 11, we have to increment the value of C<$month>. Among the
other interesting facts about localtime() we also see an explanation
of C<$year>, which, as mentioned earlier, is set to the number of
years since 1900.

We have found the bug in our code and learned new things about
localtime(). To correct the above code we just increment the month
after we call localtime():

  my ($day, $month, $year) = (localtime)[3..5];
  $month++;

=head2 Using print() and Data::Dumper for Debugging

Sometimes you need to peek into complex data structures, and trying
to print them out can be tricky. That's where C<Data::Dumper> comes
to the rescue. For example, suppose we create this complex data
structure:

  $data = {
      array => [qw(a b c d)],
      hash  => {
          foo => "oof",
          bar => "rab",
      },
  };

How do we print it out? Very easily:

  use Data::Dumper;
  print Dumper \$data;

What we get is a pretty-printed C<$data>:

  $VAR1 = \{
            'hash' => {
                        'foo' => 'oof',
                        'bar' => 'rab'
                      },
            'array' => [
                         'a',
                         'b',
                         'c',
                         'd'
                       ]
          };

While writing this example I made a mistake and wrote C<qw(a b c d)>
instead of C<[qw(a b c d)]>. When I pretty-printed the contents of
C<$data> I immediately saw my mistake:

  $VAR1 = \{
            'b' => 'c',
            'd' => 'hash',
            'HASH(0x80cd79c)' => undef,
            'array' => 'a'
          };

That's not what I wanted, of course, but I spotted the bug and
corrected it, as you saw in the original example above.

You can also use:

  print STDERR $variable;

or:

  warn $variable;

instead of print(), to have all the debug messages go to the
I<error_log> file, which makes it even easier to debug your code.

=head2 The Importance of a Good Concise Coding Style

Don't strive for elegant, clever code. Try to develop a good coding
style by writing code which is concise yet easy to understand. It's
much easier to find bugs in concise, simple code, and such code tends
to have fewer bugs in the first place.

The I<'one week ago'> example from the previous section is not
concise. There is a lot of redundancy in it, and as a result it is
harder to debug than it needs to be. Here is a condensed version of
the main loop. As you will see, this version doesn't make the code
any easier to understand:

  for (0..6) {
      next if --$day;
      $year--, $month = 12 unless --$month;
      $day = $month != 1
             ? $month_len[$month-1]
             : (($year % 4 or $year % 100 == 0) and $year % 400)
                 ? 28 : 29;
  }

Don't do that at home :)

Why did I present this version? Because it is far too obscure, which
makes it difficult to understand and maintain. On the other hand,
some of the individual constructs in it are perfectly readable
idioms.

Larry Wall, the author of Perl, is a linguist. He tried to define the
syntax of Perl in a way that makes working in Perl much like working
in English. So it can be a good idea to learn Perl's coding idioms --
some of them might seem odd at first, but once you get used to them,
you will find it difficult to understand how you could have lived
without them before. I'll show just a few of the most common ones.

It's a good idea to write code which is readable but avoids
redundancy, so it's better to write:

  unless ($i) {...}

rather than:

  if ($i == 0) {...}

if all you want to test is whether C<$i> is false.

Use a much more concise, Perlish style:

  for my $j (0..6) {...}

instead of the syntax used in some other languages:

  for (my $j = 0; $j <= 6; $j++) {...}

It's much simpler to write and comprehend code like this:

  print "something" if $debug;

than this:

  if ($debug) {
      print "something";
  }

A good style that improves understanding and readability, and reduces
the chances of having a bug, is shown below, in the form of yet
another rewrite of our I<'one week ago'> code:

  for (0..6) {
      $day--;
      next if $day;

      $month--;
      unless ($month) {
          $year--;
          $month = 12;
      }

      if ($month == 1) {
          $day = (($year % 4 or $year % 100 == 0) and $year % 400)
              ? 28 : 29;
      } else {
          $day = $month_len[$month-1];
      }
  }

which is a happy medium between the excessively verbose style of the
first version and the very obscure second version.

And of course the two-liner, which is much faster and easier to
understand, is:

  sub date_a_week_ago {
      my ($day, $month, $year) = (localtime(time - 604800))[3..5];
      return sprintf "%02d/%02d/%04d", $month+1, $day, $year+1900;
  }

Just take the current date in seconds since the epoch, as time()
returns it, subtract a week in seconds (7*24*60*60 = 604800), and
feed the result to localtime()--voila, we've got the date of one week
ago!

Why is the last version important, when the first one works just
fine? Not because of performance issues (although this last one is
twice as fast as the first), but because there are many more ways to
put a bug into the first version than into the last one.

=head2 Introduction to the Perl Debugger

As we saw earlier, it's I<almost> always possible to debug code with
the help of print(). However, it is impossible to anticipate all the
possible directions a program might take, and difficult to know what
code to suspect when trouble occurs. In addition, inline debugging
code tends to add bloat and degrade the performance of an
application, although most applications offer inline debugging as a
compile-time option to avoid these hits. In any case, this
information tends to be useful only to the programmer who added the
print statements in the first place.

Sometimes you have to debug tens of thousands of lines of Perl in an
application, and while you may be a very experienced Perl programmer
who can understand Perl code quite well just by looking at it, no
mere mortal can even begin to understand what will actually happen in
such a large application until the code is running. So you just don't
know where to start adding your trusty print() statements to see what
is happening inside.

The most effective way to track down a bug is to run the program
inside an interactive debugger.
The majority of programming languages have such a tool available,
allowing one to see what is happening inside an application while it
is running. The basic features of an interactive debugger allow you
to:

=over

=item * Stop at a certain point in the code, based on a routine name
or a source file and line number

=item * Stop at a certain point in the code, based on conditions such
as the value of a given variable

=item * Perform an action without stopping, based on the criteria
above

=item * View and modify the values of variables at any given point

=item * Provide context information such as stack traces and source
windows

=back

It does take practice to learn the most effective ways of using an
interactive debugger, but the time and effort will be paid back
many-fold in the long run.

Most C and C++ programmers are familiar with the interactive GNU
debugger (C<gdb>). C<gdb> is a stand-alone program that requires your
code to be compiled with debugging symbols to be useful. While C<gdb>
can be used to debug the Perl interpreter itself, it cannot be used
to debug your Perl scripts.

Not to worry, Perl provides its own interactive debugger, called
C<perldb>. Giving control of your Perl program to the interactive
debugger is simply a matter of specifying the C<-d> command line
switch. When this switch is used, Perl inserts debugging hooks into
the program's syntax tree, but leaves the job of debugging to a Perl
module separate from the perl binary itself.

I will start by introducing a few of the basic concepts and commands
of the Perl interactive debugger. These warm-up examples all run from
the command line, independent of mod_perl, but are still all relevant
when we finally do go inside Apache. It might be useful to keep the
I<perldebug> manpage handy, for reference while reading this section
and for future debugging sessions on your own.

The interactive debugger will attach to the current terminal and
present you with a prompt just before the first program statement is
executed. For example:

  % perl -d -le 'print "mod_perl rules the world"'

  Loading DB routines from perl5db.pl version 1.0402
  Emacs support available.

  Enter h or `h h' for help.

  main::(-e:1):   print "mod_perl rules the world"
    DB<1>

The source line shown is the line which Perl is I<about> to execute.
The C<next> command (or just C<n>) will cause this line to be
executed, after which execution will stop again, just before the next
line:

  main::(-e:1):   print "mod_perl rules the world"
    DB<1> n
  mod_perl rules the world
  Debugged program terminated.  Use q to quit or R to restart,
  use O inhibit_exit to avoid stopping after program termination,
  h q, h R or h O to get additional info.
    DB<1>

In this case, our example code is only one line long, so we have
finished interacting after the first line of code was executed. Let's
try again with a slightly longer example, namely the following
script:

  my $word = 'mod_perl';
  my @array = qw(rules the world);

  print "$word @array\n";

Save the script in a file called I<domination.pl> and run it with the
C<-d> switch:

  % perl -d domination.pl

  main::(domination.pl:1):   my $word = 'mod_perl';
    DB<1> n
  main::(domination.pl:2):   my @array = qw(rules the world);
    DB<1>

At this point, the first line of code has been executed and the
variable C<$word> has been assigned the value I<mod_perl>.
We can check this by using the C<p> command (an abbreviation for the
C<print> command; the two are interchangeable):

  main::(domination.pl:2):   my @array = qw(rules the world);
    DB<1> p $word
  mod_perl

The C<print> command works just like Perl's built-in print()
function, but it adds a trailing newline and outputs to the
C<$DB::OUT> file handle, which is normally opened on the terminal
from which Perl was launched. Let's carry on:

    DB<2> n
  main::(domination.pl:4):   print "$word @array\n";
    DB<2> p @array
  rulestheworld
    DB<3> n
  mod_perl rules the world
  Debugged program terminated.  Use q to quit or R to restart,
  use O inhibit_exit to avoid stopping after program termination,
  h q, h R or h O to get additional info.

Ouch, C<p @array> printed C<rulestheworld> and not C<rules the
world>, as you might have expected it to, but that's absolutely
correct. If you print an array without first expanding it into a
string, it is printed without the content of the C<$"> variable
(otherwise known as C<$LIST_SEPARATOR>, if the C<English> pragma is
being used) between the elements of the array.

If you type:

  print "@array";

the output will be C<rules the world>, since the default value of the
C<$"> variable is a single space.

You should have noticed by now that there is some valuable
information to the left of each executable statement:

  main::(domination.pl:4):   print "$word @array\n";
    DB<2>

First is the current package name, in this case C<main::>. Next is
the current filename and statement line number, I<domination.pl> and
4 in the example above. The number presented at the prompt is the
command number, which can be used to recall commands from the session
history using the C<!> command followed by this number. For example,
C<!1> would repeat the first command:

  % perl -d -e0

  main::(-e:1):   0
    DB<1> p $]
  5.00503
    DB<2> !1
  p $]5.00503
    DB<3>

C<$]> holds the version number of the running perl. As you can see,
C<!1> repeated the C<p $]> command, printing the command itself
followed by its output.

Things start to get more interesting as the code does. In the example
script below (save it to a file called I<test.pl>) we've increased
the number of source files and packages by including the standard
C<Symbol> module, along with an invocation of its gensym() function:

  use Symbol ();

  my $sym = Symbol::gensym();

  print "$sym\n";

  % perl -d test.pl

  main::(test.pl:3):      my $sym = Symbol::gensym();
    DB<1> n
  main::(test.pl:5):      print "$sym\n";
    DB<1> n
  GLOB(0x80c7a44)

First, notice the debugger did not stop at the first line of the
file. This is because C<use ...> is a compile-time statement, not a
run-time statement. Also notice there was more work going on than the
debugger revealed. That's because the C<next> command does not enter
subroutine calls. To step into subroutine code, use the C<step>
command (or its abbreviated form, C<s>):

  % perl -d test.pl

  main::(test.pl:3):      my $sym = Symbol::gensym();
    DB<1> s
  Symbol::gensym(/usr/lib/perl5/5.00503/Symbol.pm:86):
  86:         my $name = "GEN" . $genseq++;
    DB<1>

Notice that the source line information has changed to the
C<Symbol::gensym> package and the I<Symbol.pm> file. We can carry on
by hitting the return key at each prompt, which causes the debugger
to repeat the last C<step> or C<next> command. It won't repeat a
C<print> command, though. The debugger will eventually return from
the subroutine back to our main program:

    DB<1>
  Symbol::gensym(/usr/lib/perl5/5.00503/Symbol.pm:87):
  87:         my $ref = \*{$genpkg . $name};
    DB<1>
  Symbol::gensym(/usr/lib/perl5/5.00503/Symbol.pm:88):
  88:         delete $$genpkg{$name};
    DB<1>
  Symbol::gensym(/usr/lib/perl5/5.00503/Symbol.pm:89):
  89:         $ref;
    DB<1>
  main::(test.pl:5):      print "$sym\n";
    DB<1>
  GLOB(0x80c7a44)

Our line-by-line debugging approach has served us well for this small
program, but imagine the time it would take to step through a large
application at the same pace. There are several ways to speed up a
debugging session, one of which is known as I<setting a breakpoint>.

The C<breakpoint> command (C<b>) can be used to instruct the debugger
to stop at a named subroutine or at any line of any file. In this
example session, at the first debugger prompt we set a breakpoint at
the C<Symbol::gensym> subroutine, telling the debugger to stop at the
first line of this routine when it is called. Then, rather than
moving along with C<next> or C<step>, we give the C<continue> command
(C<c>), which tells the debugger to execute the script without
stopping until it reaches a breakpoint:

  % perl -d test.pl

  main::(test.pl:3):      my $sym = Symbol::gensym();
    DB<1> b Symbol::gensym
    DB<2> c
  Symbol::gensym(/usr/lib/perl5/5.00503/Symbol.pm:86):
  86:         my $name = "GEN" . $genseq++;

Now let's pretend we are debugging a large application where
C<Symbol::gensym> might be called in various places. When the
subroutine breakpoint is reached, by default the debugger does not
reveal where the subroutine was called from. One way to find out this
information is with the C<Trace> command (C<T>):

    DB<2> T
  $ = Symbol::gensym() called from file `test.pl' line 3

In this example the call stack is only one level deep, so only that
line is printed. We'll look at an example with a deeper stack later.
The left-most character reveals the context in which the subroutine
was called. C<$> represents scalar context; in other examples you may
see C<@>, which represents list context, or C<.>, which represents
void context. In our case we have called:

  my $sym = Symbol::gensym();

which calls C<Symbol::gensym()> in scalar context.

Below we've made our I<test.pl> example a little more complex. First,
we've added a C<My::World> package declaration at the top of the
script, so we are no longer working in the C<main::> package. Next,
we've added a subroutine named do_work(), which invokes the familiar
C<Symbol::gensym>, along with another function called
C<Symbol::qualify>, and then returns a hash reference with the
results. The do_work() routine is invoked inside a I<for> loop, which
will run twice:

  package My::World;

  use Symbol ();

  for (1,2) {
      do_work("now");
  }

  sub do_work {
      my ($var) = @_;

      return undef unless $var;

      my $sym  = Symbol::gensym();
      my $qvar = Symbol::qualify($var);

      my $retval = {
          'sym' => $sym,
          'var' => $qvar,
      };

      return $retval;
  }

We'll start by setting a few breakpoints, and then we'll use the
C<List> command (C<L>) to display them:

  % perl -d test.pl

  My::World::(test.pl:5):   for (1,2) {
    DB<1> b Symbol::qualify
    DB<2> b Symbol::gensym
    DB<3> L
  /usr/lib/perl5/5.00503/Symbol.pm:
   86:        my $name = "GEN" . $genseq++;
     break if (1)
   95:        my ($name) = @_;
     break if (1)

The filename and line number of each breakpoint are displayed just
before the source line itself. Because both breakpoints are located
in the same file, the filename is displayed only once. After the
source line we see the condition on which to stop. In this case, as
the constant value 1 indicates, we will always stop at these
breakpoints. Later on you'll see how to specify a condition.
As we will see, when the C<continue> command is executed, the program
stops at one of these breakpoints, either on line 86 or line 95 of
the C</usr/lib/perl5/5.00503/Symbol.pm> file, whichever is reached
first. The displayed code lines are the first lines of the two
subroutines from C<Symbol.pm>. Breakpoints may only be applied to
lines of run-time executable code; you cannot put breakpoints on
empty lines or comments, for example.

In our example the C<List> command shows which lines the breakpoints
were set on, but we cannot tell which breakpoint belongs to which
subroutine. There are two ways to find this out. One is to run the
C<continue> command and, when it stops, execute the C<Trace> command
we saw before:

    DB<3> c
  Symbol::gensym(/usr/lib/perl5/5.00503/Symbol.pm:86):
  86:         my $name = "GEN" . $genseq++;
    DB<3> T
  $ = Symbol::gensym() called from file `test.pl' line 14
  . = My::World::do_work('now') called from file `test.pl' line 6

So we see that it was C<Symbol::gensym>. The other way is to ask for
a listing of a range of lines from the code. For example, let's check
which subroutine line 86 is a part of. We use the C<list>
(lowercase!) command (C<l>), which displays parts of the code. The
C<list> command accepts various arguments; the one that we want to
use here is a range of lines. Since the breakpoint is at line 86,
let's print a few lines around that line:

    DB<3> l 85-87
  85      sub gensym () {
  86==>b      my $name = "GEN" . $genseq++;
  87:         my $ref = \*{$genpkg . $name};

Now we know it's the C<gensym> sub, and we also see the breakpoint
displayed with the help of the C<==E<gt>b> markup. We could also use
the name of the sub to display its code:

    DB<4> l Symbol::gensym
  85      sub gensym () {
  86==>b      my $name = "GEN" . $genseq++;
  87:         my $ref = \*{$genpkg . $name};
  88:         delete $$genpkg{$name};
  89:         $ref;
  90      }

The C<delete> command (C<d>) is used to remove a breakpoint by
specifying its line number. Let's remove the first one:

    DB<5> d 95

The C<D> command (with a capital `D') removes all currently installed
breakpoints.

Now let's look again at the trace produced at the breakpoint:

    DB<3> c
  Symbol::gensym(/usr/lib/perl5/5.00503/Symbol.pm:86):
  86:         my $name = "GEN" . $genseq++;
    DB<3> T
  $ = Symbol::gensym() called from file `test.pl' line 14
  . = My::World::do_work('now') called from file `test.pl' line 6

As you can see, the stack trace prints the values which are passed
into the subroutine. And perhaps we've found our first bug: as the
C<.> shows, do_work() was called in void context, so its return value
was lost into thin air. Let's change the I<for> loop to check the
return value of do_work():

  for (1,2) {
      my $stuff = do_work("now");
      if ($stuff) {
          print "work is done\n";
      }
  }

In this session we will set a breakpoint at line 7 of I<test.pl>,
where we check the return value of do_work():

  % perl -d test.pl

  My::World::(test.pl:5):   for (1,2) {
    DB<1> b 7
    DB<2> c
  My::World::(test.pl:7):       if ($stuff) {
    DB<2>

Our program is still small, but already it is getting more difficult
to understand the context of just one line of code. The C<window>
command (C<w>) will list a few lines of code surrounding the current
line:

    DB<2> w
  4
  5:        for (1,2) {
  6:            my $stuff = do_work("now");
  7==>b         if ($stuff) {
  8:                print "work is done\n";
  9             }
  10        }
  11
  12        sub do_work {
  13:           my($var) = @_;

The arrow points to the line which is about to be executed, and the
line also contains a C<'b'>, indicating that we have set a breakpoint
there.
The breakable lines of code include a C<`:'> immediately after the line
number.

Now, let's take a look at the value of the C<$stuff> variable with the
trusty old C<print> command:

    DB<2> p $stuff
  HASH(0x82b89b4)

That's not very useful information.  Remember, the C<print> command
works just like the built-in print() function does.  The debugger's
C<x> command evaluates a given expression and prints the results in a
"pretty" fashion:

    DB<3> x $stuff
  0  HASH(0x82b89b4)
     'sym' => GLOB(0x826a944)
        -> *Symbol::GEN0
     'var' => 'My::World::now'

There, things seem to be okay.  Let's double check by calling do_work()
with a different value and printing the results:

    DB<4> x do_work('later')
  0  HASH(0x82bacc8)
     'sym' => GLOB(0x818f16c)
        -> *Symbol::GEN1
     'var' => 'My::World::later'

We can see that the symbol was incremented from C<GEN0> to C<GEN1> and
that the variable I<later> was qualified, as expected.

Now let's change the test program a little to iterate over a list of
arguments held in C<@args> and print a slightly different message:

  package My::World;

  use Symbol ();

  my @args = qw(now later);
  for my $arg (@args) {
      my $stuff = do_work($arg);
      if ($stuff) {
          print "do your work $arg\n";
      }
  }

  sub do_work {
      my($var) = @_;

      return undef unless $var;

      my $sym  = Symbol::gensym();
      my $qvar = Symbol::qualify($var);

      my $retval = {
          'sym' => $sym,
          'var' => $qvar,
      };

      return $retval;
  }

There are only two arguments in the list, so stopping to look at each
one isn't too time consuming, but consider the debugging pace if we had
a large list of 100 or so entries.  It is possible to customize
breakpoints by specifying a condition.  Each time a breakpoint is
reached, the condition is evaluated, and the debugger stops only if the
condition is true.  In the session below, the C<window> command shows
the breakable lines, and we set a breakpoint at line 7 with the
condition C<$arg eq 'later'>.  As we continue, the breakpoint is
skipped when C<$arg> has the value of I<now>, but not when it has the
value of I<later>:

  % perl -d test.pl

  My::World::(test.pl:5):   my @args = qw(now later);
    DB<1> w
  2
  3:        use Symbol ();
  4
  5==>      my @args = qw(now later);
  6:        for my $arg (@args) {
  7:            my $stuff = do_work($arg);
  8:            if ($stuff) {
  9:                print "do your work $arg\n";
  10        }
  11        }

The C<==E<gt>> symbol shows us the line of code that's about to be
executed.

    DB<1> b 7 $arg eq 'later'
    DB<2> c
  do your work now
  My::World::(test.pl:7):       my $stuff = do_work($arg);
    DB<2> n
  My::World::(test.pl:8):       if ($stuff) {
    DB<2> x $stuff
  0  HASH(0x82b90e4)
     'sym' => GLOB(0x82b9138)
        -> *Symbol::GEN1
     'var' => 'My::World::later'
    DB<5> c
  do your work later
  Debugged program terminated.  Use q to quit or R to restart,

There are plenty more tricks left to pull from the perldb bag, but you
should now understand enough about the debugger to try them on your own
with the perldebug manpage by your side.  Quick online help from inside
the debugger can be reached by typing the C<h> command.  It will
display a list of the most useful commands and a short explanation of
what they do.

=head2 Interactive Perl Debugging under mod_cgi

C<Devel::ptkdb> is a visual Perl debugger that uses Perl/Tk for the
user interface and requires a windowing system like X Windows or
Windows to run.  To debug a plain perl script with ptkdb, invoke it as:

  % perl -d:ptkdb myscript.pl

The Tk application will be loaded.  Now you can do most of the
debugging you did with the command line Perl debugger, but using a
simple GUI to set and remove breakpoints, browse the code, step through
it and more.

With the help of ptkdb you can debug your CGI scripts running under
mod_cgi.
Be sure that the web server's Perl installation includes the Tk
package.  In order to enable the debugger you should change your
"shebang" line from

  #! /usr/local/bin/perl -Tw

to

  #! /usr/local/bin/perl -Twd:ptkdb

You can debug scripts remotely if you're using a Unix based server and
if the machine where you are writing the script has an X-server.  The
X-server can be another Unix workstation, or a Macintosh or Win32
platform with an appropriate X-Windows package.  You must insert the
following C<BEGIN> subroutine into your script:

  BEGIN {
      $ENV{'DISPLAY'} = "myHostname:0.0";
  }

You can use either the IP (I<123.123.123.123:0.0>) or the DNS
convention (I<myhost.com:0.0>).  You must be sure that your web server
has permission to open windows on your X-server (see the I<xhost>
manpage for more info).

Access the web page with the browser and I<Submit> the script as
normal.  The ptkdb window should appear on the monitor if you have
correctly set the C<$ENV{'DISPLAY'}> variable.  At this point you can
start debugging your script.  Be aware that the browser may time out
waiting for the script to run.

To expedite debugging you may want to set your breakpoints in advance
with a I<.ptkdbrc> file and use the C<$DB::no_stop_at_start> variable.
NOTE: for debugging web scripts you may have to have the I<.ptkdbrc>
file installed in the server account's home directory (~www) or the
home directory of whatever username the webserver is running under.
Also try installing a I<.ptkdbrc> file in the same directory as the
target script.

META: insert snapshots of ptkdb screen

ptkdb is not part of the standard perl distribution; it is available
from CPAN: http://www.perl.com/CPAN/authors/id/A/AE/AEPAGE/

=head2 Non-Interactive Perl Debugging under mod_perl

To debug scripts running under mod_perl, either use
L<Apache::DB (interactive Perl debugging)|debug/Interactive_mod_perl_Debugging>
or an older non-interactive method as described below.

The C<NonStop> debugger option enables you to get some decent debugging
information when running under mod_perl.  For example, before starting
the server:

  % setenv PERL5OPT -d
  % setenv PERLDB_OPTS "NonStop=1 LineInfo=db.out AutoTrace=1 frame=2"

Now watch I<db.out> for line:filename information.  This is most useful
for tracking those core dumps that normally leave us guessing, even
with a stack trace from gdb.  I<db.out> will show you what Perl code
triggered the core dump.  See I<man perldebug> for more C<PERLDB_OPTS>
options.

Note that Perl will ignore C<PERL5OPT> if C<PerlTaintCheck> is C<On>.

=head2 Interactive mod_perl Debugging

Now we'll turn to looking at how the interactive debugger is used in a
mod_perl environment.  The C<Apache::DB> module available from CPAN
provides a wrapper around C<perldb> for debugging Perl code running
under mod_perl.

The server must be run in non-forking mode to use the interactive
debugger; this mode is turned on by passing the C<-X> flag to the httpd
executable.  It is convenient to use an C<IfDefine> section around the
C<Apache::DB> configuration; the example below does this using the name
I<PERLDB>.  With this setup, debugging is only turned on when starting
the server with the C<httpd -D PERLDB> command.

This section should be at the top of the Perl configuration section of
the configuration file, before any other Perl code is pulled in, so
that debugging symbols will be inserted into the syntax tree, triggered
by the call to C<Apache::DB-E<gt>init>.
The C<Apache::DB::handler> can be configured using any of the
C<Perl*Handler> directives.  In this case we use a C<PerlFixupHandler>,
so that handlers in the response phase will bring up the debugger
prompt:

  <IfDefine PERLDB>

    <Perl>
      use Apache::DB ();
      Apache::DB->init;
    </Perl>

    <Location />
      PerlFixupHandler Apache::DB
    </Location>

  </IfDefine>

Since we have used C</> as the argument to the C<Location> directive,
the debugger will be invoked for any kind of request (even for static
documents and images), but of course it will immediately quit unless
there is some Perl module registered to handle these requests.

In our first example, we will debug the standard C<Apache::Status>
module, which is configured like this:

  PerlModule Apache::Status
  <Location /perl-status>
    PerlHandler Apache::Status
    SetHandler perl-script
  </Location>

When the server is started with the debugging flag, a notice will be
printed to the console:

  % httpd -X -D PERLDB
  [notice] Apache::DB initialized in child 950

The debugger prompt will not be available until the first request is
made, in our case to http://localhost/perl-status.  Once we are at the
prompt, all the standard debugging commands are available.  First we
run C<window> to get some of the context for the code being debugged,
then we move to the next statement after a value has been assigned to
C<$r>, and finally we print the request URI.  If no breakpoints are
set, the C<continue> command will give control back to Apache and the
request will finish with the C<Apache::Status> main menu showing in the
browser window:

  Loading DB routines from perl5db.pl version 1.0402
  Emacs support available.

  Enter h or `h h' for help.

  Apache::Status::handler(/usr/lib/perl5/site_perl/5.005/i386-linux/Apache/Status.pm:55):
  55:         my($r) = @_;
    DB<1> w
  52      }
  53
  54      sub handler {
  55==>       my($r) = @_;
  56:         Apache->request($r); #for Apache::CGI
  57:         my $qs = $r->args || "";
  58:         my $sub = "status_$qs";
  59:         no strict 'refs';
  60
  61:         if($qs =~ s/^(noh_\w+).*/$1/) {
    DB<1> n
  Apache::Status::handler(/usr/lib/perl5/site_perl/5.005/i386-linux/Apache/Status.pm:56):
  56:         Apache->request($r); # for Apache::CGI
    DB<1> p $r->uri
  /perl-status
    DB<2> c

All the techniques we saw while debugging plain perl scripts can be
applied to this debugging session.

Debugging C<Apache::Registry> scripts is somewhat different, because
the handler routine does quite a bit of work before it reaches your
script.  In this example, we make a request for C</perl/test.pl>, which
consists of this code:

  use strict;

  my $r = shift;
  $r->send_http_header('text/plain');

  print "mod_perl rules";

When a request is issued, the debugger stops at line 28 of
I<Apache/Registry.pm>.  We set a breakpoint at line 140, which is the
line that actually calls the script wrapper subroutine.  The
C<continue> command will bring us to that line, where we can step into
the script handler:

  Apache::Registry::handler(/usr/lib/perl5/site_perl/5.005/i386-linux/Apache/Registry.pm:28):
  28:         my $r = shift;
    DB<1> b 140
    DB<2> c
  Apache::Registry::handler(/usr/lib/perl5/site_perl/5.005/i386-linux/Apache/Registry.pm:140):
  140:        eval { &{$cv}($r, @_) } if $r->seqno;
    DB<2> s
  Apache::ROOT::perl::test_2epl::handler((eval 87):3):
  3:          my $r = shift;

Notice the funny package name; it's generated from the URI of the
request for namespace protection.
The filename is not displayed, since the code was compiled via eval(),
but the C<print> command can be used to show you C<$r-E<gt>filename>:

    DB<2> n
  Apache::ROOT::perl::test_2epl::handler((eval 87):4):
  4:          $r->send_http_header('text/plain');
    DB<2> p $r->filename
  /home/httpd/perl/test.pl

The line number might seem off too, but the C<window> command will give
you a better idea where you are:

    DB<4> w
  1:      package Apache::ROOT::perl::test_2epl;use Apache qw(exit);sub handler {  use strict;
  2
  3:          my $r = shift;
  4==>        $r->send_http_header('text/plain');
  5
  6:          print "mod_perl rules";
  7
  8       }
  9       ;

The code from the I<test.pl> file is between lines 2 and 7; the rest is
the C<Apache::Registry> magic to cache your code inside a I<handler>
subroutine.

It will always take some practice and patience when putting together
debugging strategies that make effective use of the interactive
debugger for various situations.  Once you have a good strategy, bug
squashing can actually be quite a bit of fun!

=head2 ptkdb and Interactive mod_perl Debugging

As you saw earlier, you can use the C<ptkdb> visual debugger to debug
CGI scripts running under mod_cgi.  But it won't work for mod_perl
using the same configuration as used in mod_cgi.  We have to tweak the
I<Apache/DB.pm> module to use I<Devel/ptkdb.pm> instead of
I<Apache/perl5db.pl>.

Open the file in your favorite editor and replace:

  require 'Apache/perl5db.pl';

with:

  require 'Devel/ptkdb.pm';

Now when you use the interactive mod_perl debugger configuration from
the previous section and issue a request, the I<ptkdb> visual debugger
will be loaded.

If you are debugging C<Apache::Registry> scripts, as in the terminal
debugging mode example, go to line 140 (or whatever line the
C<eval { &{$cv}($r, @_) } if $r-E<gt>seqno;> statement is located at)
and press the I<step in> button to start debugging the script itself.

Note that you can use Apache with C<ptkdb> in plain multi-server mode;
you don't have to start C<httpd> with the C<-X> option.

META: One caveat: When the request is completed, C<ptkdb> hangs.  Does
anyone know what code should be registered for it to exit on
completion, to replace the original C<Apache::DB> cleanup code:

  if (ref $r) {
      $SIG{INT} = \&DB::catch;
      $r->register_cleanup(sub {
          $SIG{INT} = \&DB::ApacheSIGINT();
      });
  }

Any Perl/Tk guru to assist???

=head2 Debugging when Server Crashes on Startup before Writing to Log File

If your server crashes on startup, you need to start it under gdb and
ask it to generate a stack trace.

I'll emulate a faulty server by using a startup file with a call to the
dump() function in it:

  startup.pl
  ----------
  dump;

  1;

and then requiring this file from I<httpd.conf>:

  PerlRequire /path/to/startup.pl

Make sure no server is running on port 80, or use an alternate
configuration with an alternate port if you are using a production
server.

  % gdb /path/to/httpd
  (gdb) set args -X

Use:

  set args -X -f /path/to/alternate/serverconfig_ifneeded.conf

if the server must be started from an alternative configuration file.

Now run the program:

  (gdb) run
  Starting program: /usr/local/apache/bin/httpd -X

  Program received signal SIGABRT, Aborted.
  0x400da4e1 in __kill () from /lib/libc.so.6

At this point the server should die because of the call to C<dump()>.
When that happens we use C<bt> or C<where> to ask for a stack
backtrace.
  (gdb) where
  #0  0x400da4e1 in __kill () from /lib/libc.so.6
  #1  0x80d43bc in Perl_my_unexec ()
  #2  0x8119544 in Perl_pp_goto ()
  #3  0x8118990 in Perl_pp_dump ()
  #4  0x812b2ad in Perl_runops_standard ()
  #5  0x80d3a9c in perl_eval_sv ()
  #6  0x807ef1c in perl_do_file ()
  #7  0x807ef4f in perl_load_startup_script ()
  #8  0x807b7ec in perl_cmd_require ()
  #9  0x8092af7 in ap_clear_module_list ()
  #10 0x8092f43 in ap_handle_command ()
  #11 0x8092fd7 in ap_srm_command_loop ()
  #12 0x80933e0 in ap_process_resource_config ()
  #13 0x8093ca2 in ap_read_config ()
  #14 0x809db63 in main ()
  #15 0x400d41eb in __libc_start_main (main=0x809d8dc <main>, argc=2,
      argv=0xbffffab4, init=0x80606f8 <_init>, fini=0x812b38c <_fini>,
      rtld_fini=0x4000a610 <_dl_fini>, stack_end=0xbffffaac)
      at ../sysdeps/generic/libc-start.c:90

If you do not know what this trace means, you could send it to the
mod_perl mailing list to ask for help.  Make sure to include the
version numbers of Apache, mod_perl and Perl, and use a subject line
that says something about the problem rather than 'help'.

In our case we already know that the server is supposed to die when
compiling the startup file, and we can clearly see that from the trace.
We always read it from the bottom upward:

We are in the configuration file:

  #13 0x8093ca2 in ap_read_config ()

We do require:

  #8  0x807b7ec in perl_cmd_require ()

We load the file and compile it:

  #6  0x807ef1c in perl_do_file ()
  #5  0x80d3a9c in perl_eval_sv ()

C<dump()> gets executed:

  #3  0x8118990 in Perl_pp_dump ()

C<dump()> calls C<__kill()>:

  #0  0x400da4e1 in __kill () from /lib/libc.so.6

=head1 Hanging Processes: Detection and Diagnostics

Sometimes an httpd process might hang in the middle of processing a
request, either because there is a bug in your code (e.g. the code is
stuck in a while loop), because it is blocked by some system call or by
a resource deadlock, or for some other reason.  In order to fix the
problem we need to learn in what circumstances the process hangs
(detection), so that we can reproduce the problem, and after that we
need to discover why there is a problem (diagnostics).

=head2 Hanging because of the OS Problem

Sometimes you can find a process hanging because of some kind of system
problem.  For example, if the process was doing some disk I/O operation
it might get stuck in uninterruptible sleep (C<'D'> disk wait in a
ps(1) report, C<'U'> in top(1)), which indicates that either something
is broken in your kernel or that you're using NFS.  You cannot
S<kill -9> such a process.

Another process that cannot be killed with S<kill -9> is a zombie
process (C<'Z'> in a ps(1) report, C<E<lt>defunctE<gt>> in top(1)).  In
this case the process is already dead, and Apache didn't wait on it
properly.

In the case of I<disk wait> you can actually get the I<wait> channel
from ps(1) and look it up in your kernel symbol table to find out what
resource it was waiting on.  This might point the way to what component
of the system was misbehaving, if the problem occurred frequently.

=head2 An Example of Code that Might Hang a Process

Deadlock is the situation where, for example, two processes, say X and
Y, need two resources, A and B, to continue.  X holds onto A and Y
holds onto B.  There is no possibility for Y to continue before X
releases A.  But X cannot release A before it gets B.
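To make the deadlock concrete, here is a minimal sketch (my own
illustration, not part of the original example; the lock file names are
arbitrary) of two processes locking each other out with flock():

  use Fcntl qw(:flock);
  use Symbol;

  # process X (the parent) takes lock A and then lock B;
  # process Y (the child) takes lock B and then lock A
  my @files = fork() ? ("/tmp/lockA", "/tmp/lockB")
                     : ("/tmp/lockB", "/tmp/lockA");

  my @fhs;
  for my $file (@files) {
      my $fh = gensym;
      open $fh, ">$file" or die "cannot open $file: $!";
      flock $fh, LOCK_EX;        # the second flock() never returns
      print "$$: locked $file\n";
      push @fhs, $fh;            # keep the handle alive so the lock is held
      sleep 1;                   # let the other process grab its first lock
  }

Run from the command line, each process prints one C<locked> message,
and then both block forever in their second flock(), each waiting for
the lock the other one holds.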
Look at the following example.  Your process has to gain a lock on some
resource (e.g. a file) before it continues.  So it makes an attempt,
and if that fails it sleep()s for a second and increments a counter:

  until (gain_lock()) {
      $tries++;
      sleep 1;
  }

Because there are many processes competing for this resource, or
perhaps because there is a deadlock, gain_lock() always fails.  The
process is hung.

Another situation that you may very often encounter is exclusive lock
starvation.  Generally there are two lock types in use: I<SHARED>
locks, which allow many processes to perform I<READ> operations
simultaneously, and I<EXCLUSIVE> locks, which permit access only by a
single process and so make a safe I<WRITE> operation possible.

You can lock any kind of resource, although in our examples we will
talk about files.

If there is a I<READ> lock request, it is granted as soon as the file
becomes unlocked, or immediately if it is already I<READ> locked.  The
lock status becomes I<READ> on success.

If there is a I<WRITE> lock request, it is granted as soon as the file
becomes unlocked.  The lock status becomes I<WRITE> on success.

Normally it is the I<WRITE> lock request which is the most important.
If the file is being I<READ> locked, a process that requests to write
will poll until there are no reading or writing processes left.
However, lots of processes can successfully read the file, since they
do not block each other from doing so.  This means that a process that
wants to write to the file (first obtaining an exclusive lock) never
gets a chance to squeeze in.  The following diagram represents a
possible scenario where everybody can read but no one can write:

  [-p1-]               [--p1--]
     [--p2--]
       [---------p3---------]
                  [------p4-----]
                    [--p5--]   [----p5----]

Let's look at some real code and see it in action.  The following
script imports flock() related parameters from the C<Fcntl> module and
opens a file that will be locked.  It then defines and sets two
variables: C<$lock_type> and C<$lock_type_verbose>.  These are set to
C<LOCK_EX> and C<EX> respectively if the first command line argument
(C<$ARGV[0]>) is defined and equal to I<w>; this indicates that this
process will try to gain a I<WRITE> (exclusive) lock.  Otherwise the
two are set to C<LOCK_SH> and C<SH> for a I<SHARED> (read) lock.

Once the variables are set, we enter the infinite C<while(1)> loop that
attempts to lock the file by the mode set in C<$lock_type>.  It reports
success and the type of lock that was gained, then it sleeps for a
random period between 0 and 9 seconds and unlocks the file.  The loop
then starts from the beginning.

  lock.pl
  -------------------
  #!/usr/bin/perl -w

  use Fcntl qw(:flock);

  $lock = "/tmp/lock";

  open LOCK, ">$lock" or die "Cannot open $lock for writing: $!";

  my $lock_type         = LOCK_SH;
  my $lock_type_verbose = 'SH';
  if (defined $ARGV[0] and $ARGV[0] eq 'w') {
      $lock_type         = LOCK_EX;
      $lock_type_verbose = 'EX';
  }

  while (1) {
      flock LOCK, $lock_type;
      # start of critical section
      print "$$: $lock_type_verbose\n";
      sleep int(rand(10));
      # end of critical section
      flock LOCK, LOCK_UN;
  }

  close LOCK;

It's very easy to see I<WRITE> process starvation if you spawn a few of
the above scripts simultaneously.  Start the first few as I<READ>
processes and then start one I<WRITE> process like this:

  % ./lock.pl r & ./lock.pl r & ./lock.pl r & ./lock.pl w &

You will see something like this:

  24233: SH
  24232: SH
  24232: SH
  24233: SH
  24232: SH
  24233: SH
  24231: SH
  24231: SH
  24231: SH

and not a single C<EX> line...  Only when you kill off the reading
processes will the write process gain its lock.

Note that as this is a rough example, I used the sleep() function.
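A more realistic simulation would keep the critical section much
shorter than whole seconds.  Here is a minimal sketch (my own variation
of I<lock.pl>, not part of the original example) that uses the
C<Time::HiRes> module to sleep for a random fraction of a second
instead:

  #!/usr/bin/perl -w

  use Fcntl qw(:flock);
  use Time::HiRes qw(sleep);   # this sleep() accepts fractional seconds

  my $lock = "/tmp/lock";
  open LOCK, ">$lock" or die "Cannot open $lock for writing: $!";

  while (1) {
      flock LOCK, LOCK_SH;
      # start of the (now much shorter) critical section
      print "$$: SH\n";
      sleep rand(0.5);         # sleep between 0 and 0.5 seconds
      # end of critical section
      flock LOCK, LOCK_UN;
  }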
The C<Time::HiRes> module used in the sketch above allows you to choose
more precise intervals to sleep, and so to simulate a real situation
more closely.

The interval between the lock and the unlock is called a I<Critical
Section>, which should be kept as short as possible (in terms of the
time taken to execute the code, not in terms of the number of lines of
code).  As you just saw, a single sleep statement can make the critical
section long.

To summarize, if you have a script that uses both I<READ> and I<WRITE>
locks and the critical section isn't very short, the writing process
might be starved.  After a while, a browser that initiated this request
will time out the connection and abort the request, but it's much more
likely that the user will press the I<Stop> or I<Reload> button before
that happens.  Since the process in question is just waiting, there is
no way for Apache to know that the request was aborted.  It will hang
until the lock is gained.  Only when a write to the client's broken
connection is attempted will Apache terminate the script.

=head2 Detecting hanging processes

It's not so easy to detect hanging processes.  There is no way you can
tell how long a request has been taking to process by using plain
system utilities like ps(1) and top(1).  The reason is that each Apache
process serves many requests without quitting.  System utilities can
tell how long a process has been running since its creation, but this
information is useless in our case, since Apache processes normally run
for extended periods.

However, there are a few approaches that can help to detect a hanging
process.  If the process hangs and demands lots of resources, it's
quite easy to spot it by using the top(1) utility.  You will see the
same process show up in the first few lines of the automatically
refreshed report.  But often the hanging process uses few resources,
e.g. when waiting for some event to happen.

Another easy case is when some process thrashes the I<error_log>,
writing millions of error messages there.  Generally this process uses
lots of resources and is also easily spotted by using top(1).

There are other tools that report the status of Apache processes:

=over 4

=item *

The mod_status module, which is usually accessed from the
I</server_status> location.

=item *

The C<Apache::VMonitor> module.

=back

Both tools provide counters of processed requests per Apache process.
You can watch the report for a few minutes and try to spot any process
which has the same number of processed requests while its status is
'W' (waiting).  This means that it has hung.

But if you have fifty processes, it can be quite hard to spot such a
process.  L<Apache::Watchdog::RunAway is a hanging processes monitor
and terminator|modules/Apache_Watchdog_RunAway_Hang> that implements
this feature and should be used to solve this kind of problem.

If you've got a real problem, and the processes hang one after the
other, the time will come when the number of hanging processes is equal
to the value of C<MaxClients>.  This means that no more processes will
be spawned.  As far as the users are concerned, your server is down.
It is easy to detect this situation, attempt to resolve it and notify
the administrator using a simple crontab watchdog that requests some
very light script periodically.  (See L<Monitoring the Server. A
watchdog.|control/Monitoring_the_Server_A_watchdo>)

In the watchdog you set a timeout appropriate for your service, which
may be anything from a few seconds to a few minutes.
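Here is a minimal sketch of such a watchdog check (an illustration
only: the URL, the 30 second timeout, and the missing restart and
notification logic are all assumptions of this text, not a prescribed
implementation):

  #!/usr/bin/perl -w

  # request a very light script; if no answer arrives before the
  # timeout expires, assume the server is in trouble
  use LWP::UserAgent ();
  use HTTP::Request ();

  my $ua = LWP::UserAgent->new;
  $ua->timeout(30);    # seconds; pick a value right for your service

  my $res = $ua->request(
      HTTP::Request->new(GET => "http://localhost/perl/ping.pl"));

  unless ($res->is_success) {
      warn "watchdog: no response: ", $res->status_line, "\n";
      # here you would attempt to restart the server and email
      # a report to the administrator
  }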
If the server fails to respond before the timeout expires, the watchdog
has spotted trouble and attempts to restart the server.  After a
restart an email report is sent to the administrator saying that there
was a problem and whether or not the restart was successful.

If you get such reports constantly, something is wrong with your web
service and you should revise your code.  Note that it's possible that
your server is being overloaded by more requests than it can handle, so
the requests are being queued and not processed for a while, which
triggers the watchdog's alarm.  If this is the case, you may need to
add more servers or more memory, or perhaps split your single machine
across a cluster of machines.

=head2 Determination of the reason

Given the process id (PID), there are three ways to find out where the
server is hanging:

=over

=item 1

Deploying the Perl calls tracing mechanism.  This will allow you to
spot the location of the Perl code that has triggered the problem.

=item 2

Using the system calls tracing utilities, like strace(1) or truss(1).
This approach reveals low level details about a potential misbehavior
of some part of the system.

=item 3

Using an interactive debugger, like gdb(1).  When the process is stuck
and you don't know what it was doing just before it got stuck, you can
attach gdb to the process and print its call stack, to reveal where the
last call was made from.  Just as with strace or truss, you see the
system calls trace and not the Perl calls.

=back

=head3 Using the Perl Trace

To see where an httpd is "spinning", try adding this to your script or
a startup file:

  use Carp ();
  $SIG{'USR2'} = sub {
      Carp::confess("caught SIGUSR2!");
  };

The above code assigns a signal handler for the C<USR2> signal.  This
signal has been chosen because it's the least likely to be used by the
other parts of the server.

We check the registered signal handlers with the help of
L<Apache::Status|debug/Apache__Status____Embedded_Interpreter_Status_Information>.
What we see at http://localhost/perl-status?sig is:

  USR2 = \&MyStartUp::__ANON__

C<MyStartUp> is the name of the package I've used in my I<startup.pl>.

After applying this server configuration, let's use this simple code
example, where sleep(10000) will emulate a hanging process:

  debug/perl_trace.pl
  -------------------
  $|=1;
  print "Content-type:text/plain\r\n\r\n";
  print "[$$] Going to sleep\n";
  hanging_sub();
  sub hanging_sub { sleep 10000; }

We request the above script at
I<http://localhost/perl/debug/perl_trace.pl>.  Since we have used
C<$|=1;> and printed the PID with C<$$>, we know which process ID we
want to work with.  Now we send the signal from the command line, using
the PID we have just seen printed to the browser's window:

  % kill -USR2 PID

and watch this show up in the I<error_log> file:

  caught SIGUSR2!
      at /home/httpd/perl/startup/startup.pl line 32
  MyStartUp::__ANON__('USR2') called
      at /home/httpd/perl/debug/perl_trace.pl line 5
  Apache::ROOT::perl::debug::perl_trace_2epl::hanging_sub() called
      at /home/httpd/perl/debug/perl_trace.pl line 4
  Apache::ROOT::perl::debug::perl_trace_2epl::handler
      ('Apache=SCALAR(0x8309d08)') called
      at /usr/lib/perl5/site_perl/5.005/i386-linux/Apache/Registry.pm
      line 140
  eval {...} called
      at /usr/lib/perl5/site_perl/5.005/i386-linux/Apache/Registry.pm
      line 140
  Apache::Registry::handler('Apache=SCALAR(0x8309d08)') called
      at PerlHandler subroutine `Apache::Registry::handler' line 0
  eval {...} called
      at PerlHandler subroutine `Apache::Registry::handler' line 0

We can clearly see that the process "hangs" in the code executed at
line 5 of the I</home/httpd/perl/debug/perl_trace.pl> script, inside
the hanging_sub() routine, which was called at line 4.

=head3 Using the System Calls Trace

Depending on the operating system, you should have either the truss(1)
or the strace(1) utility available.  In the following examples we will
use strace(1).

There are two ways to get a trace of a process with strace(1) (similar
to gdb(1)).  The first one is to tell strace(1) to start the process
and do the tracing on it:

  % strace perl -le 'print "mod_perl rules"'

The second is to tell strace(1) to attach to a process that's already
running.  You need to know the PID of the process:

  % strace -p PID

Replace PID with the process number you want to check on.

strace(1) accepts many more arguments that you might find useful.  For
example, you can tell it to trace only specific system calls:

  % strace -e trace=open,write,close,nanosleep \
    perl -le 'print "mod_perl rules"'

In this example we have asked strace(1) to show us only the I<open>,
I<write>, I<close> and I<nanosleep> system calls, which simplifies
observing the output generated by strace(1) if you know what you are
looking for.

Let's write a mod_perl script that hangs, and deploy strace(1) to find
the point at which it hangs:

  hangme.pl
  ---------
  $|=1;

  my $r = shift;
  $r->send_http_header('text/plain');

  print "PID = $$\n";

  while (1) {
      $i++;
      sleep 1;
  }

The reason this simple code hangs is obvious: it never breaks from the
while loop.  As you have noticed, it prints the PID of the current
process to the browser.  Of course in a real situation you cannot use
the same trick; in the previous section I have presented a few ways to
detect runaway processes and their PIDs.

I save the above code in a file and execute it from the browser.  Note
that I've made STDOUT unbuffered with C<$|=1;>, so I will immediately
see the process ID.  Once the script is requested, it prints the
process PID and obviously hangs.  So we press the C<'Stop'> button, but
the process continues to hang in this code.  Isn't Apache supposed to
detect the broken connection and abort the request?  I<Yes> and I<No>;
you will soon understand what's really happening.

First let's attach to the process and see what it's doing.  I use the
PID the script printed to the browser, which is 10045 in this case:

  % strace -p 10045

  [...truncated identical output...]
  SYS_175(0, 0xbffff41c, 0xbffff39c, 0x8, 0) = 0
  SYS_174(0x11, 0, 0xbffff1a0, 0x8, 0x11) = 0
  SYS_175(0x2, 0xbffff39c, 0, 0x8, 0x2)   = 0
  nanosleep(0xbffff308, 0xbffff308, 0x401a61b4, 0xbffff308, 0xbffff41c) = 0
  time([940973834])                       = 940973834
  time([940973834])                       = 940973834
  [...truncated the identical output...]

It isn't what we expected to see, is it?  These are some system calls
we don't see in our little example.
What we actually see is how Perl translates our code into system calls.
Since we know that our code hangs in this snippet:

  while (1) {
      $i++;
      sleep 1;
  }

we I<"easily"> figure out that the first three system calls implement
the C<$i++>, while the other three are responsible for the C<sleep 1>
call.

Generally the situation is the reverse of our example: you detect the
hanging process, you attach to it and watch the trace of calls it does
(or the last few calls, if the process is hanging waiting for
something, e.g. when blocking on a file lock request).  From watching
the trace you figure out what it's actually doing, and probably find
the corresponding lines in your Perl code.  For example, let's see how
one process I<"hangs"> while requesting an exclusive lock on a file
exclusively locked by another process:

  excl_lock.pl
  ---------
  use Fcntl qw(:flock);
  use Symbol;

  if (fork()) {
      my $fh = gensym;
      open $fh, ">/tmp/lock" or die "cannot open /tmp/lock: $!";
      print "$$: I'm going to obtain the lock\n";
      flock $fh, LOCK_EX;
      print "$$: I've got the lock\n";
      sleep 20;
      close $fh;
  }
  else {
      my $fh = gensym;
      open $fh, ">/tmp/lock" or die "cannot open /tmp/lock: $!";
      print "$$: I'm going to obtain the lock\n";
      flock $fh, LOCK_EX;
      print "$$: I've got the lock\n";
      sleep 20;
      close $fh;
  }

The code is simple.  The process executing the code forks a second
process, and both do the same thing: generate a unique symbol to be
used as a file handle, open the lock file for writing using the
generated symbol, lock the file in exclusive mode, sleep for 20 seconds
(pretending to do some lengthy operation) and close the lock file,
which also unlocks the file.

The C<gensym> function is imported from the C<Symbol> module.  The
C<Fcntl> module provides us with the symbolic constant C<LOCK_EX>.
This is imported via the C<:flock> tag, which imports this and other
flock() constants.

The code used by both processes is identical, therefore we cannot
predict which one will get its hands on the lock file and succeed in
locking it first; so we add print() statements to find the PID of the
process blocking (waiting to get the lock) on a lock request.

When the above code is executed from the command line, we see that one
of the processes gets the lock:

  % ./excl_lock.pl

  3038: I'm going to obtain the lock
  3038: I've got the lock
  3037: I'm going to obtain the lock

Here we see that process 3037 is blocking, so we attach to it:

  % strace -p 3037

  about to attach c10
  flock(3, LOCK_EX

It's clear from the above trace that the process waits for an exclusive
lock.  (Note that the missing closing parenthesis is not a typo!)

As you become familiar with watching the traces of different processes,
you will understand what is happening more easily.

=head3 Using the Interactive Debugger

Another approach to see a trace of the running code is to use a
debugger such as C<gdb> (the GNU debugger).  It's supposed to work on
any platform which supports the GNU development tools.  Its purpose is
to allow you to see what is going on I<inside> a program while it
executes, or what it was doing at the moment it crashed.

To trace the execution of a process, C<gdb> needs to know the process
id (PID) and the path to the binary that the process is executing.  For
Perl code it's I</usr/bin/perl> (or whatever the path to your Perl is);
for httpd processes it will be the path to your httpd executable.

Here are a few examples using gdb.
Let's go back to our last locking example, execute it as before and
attach to the process that didn't get the lock:

  % gdb /usr/bin/perl 3037

After starting the debugger we execute the C<where> command to see the
trace:

  (gdb) where
  #0  0x40131781 in __flock ()
  #1  0x80a5421 in Perl_pp_flock ()
  #2  0x80b148d in Perl_runops_standard ()
  #3  0x80592b8 in perl_run ()
  #4  0x805782f in main ()
  #5  0x400a6cb3 in __libc_start_main (main=0x80577c0 <main>, argc=2,
      argv=0xbffff7f4, init=0x8056af4 <_init>, fini=0x80b14fc <_fini>,
      rtld_fini=0x4000a350 <_dl_fini>, stack_end=0xbffff7ec)
      at ../sysdeps/generic/libc-start.c:78

This is a different kind of trace from what we saw with strace(1).
C<#0> tells us the most recent call that was executed, which is the C
language flock() implementation.  But the previous call (C<#1>) isn't
print(), as we might expect, but a higher level of Perl's internal
flock().  If we follow the trace of calls, what we actually see is an
opcode tree, which can be better presented as:

  __libc_start_main
    main ()
      perl_run ()
        Perl_runops_standard ()
          Perl_pp_flock ()
            __flock ()

So I would say that it's less useful than C<strace>, since if there are
several flock()s it's almost impossible to know which of them was
called.  This problem is solved by C<strace>, which shows the sequence
of the system calls executed.  Using this sequence we can locate the
corresponding lines in the code.

(META: the above is wrong - you can ask to display the previous command
executed by the program (not gdb)!  What is it?)

When you attach to a running process with the debugger, the program
stops executing and control of the program is passed to the debugger.
You can continue the normal program run with the C<continue> command,
or execute it step by step with the C<next> and C<step> commands which
you type at the C<gdb> prompt.  (C<next> steps over any function calls
in the line, while C<step> steps into them.)

C/C++ debuggers are a very large topic and beyond the scope of this
document, but the gdb man page is quite good and you can try
C<info gdb> as well.  You might also want to check out C<ddd> (the Data
Display Debugger), which provides a visual interface to C<gdb> and
other debuggers.  It even knows how to debug Perl programs!

For completeness, let's see the gdb trace of the httpd process that's
still hanging in the C<while(1)> loop of the first example in this
section:

  % gdb /usr/local/apache/bin/httpd 1005

  (gdb) where
  #0  0x4014a861 in __libc_nanosleep ()
  #1  0x4014a7ed in __sleep (seconds=1)
      at ../sysdeps/unix/sysv/linux/sleep.c:78
  #2  0x8122c01 in Perl_pp_sleep ()
  #3  0x812b25d in Perl_runops_standard ()
  #4  0x80d3721 in perl_call_sv ()
  #5  0x807a46b in perl_call_handler ()
  #6  0x8079e35 in perl_run_stacked_handlers ()
  #7  0x8078d6d in perl_handler ()
  #8  0x8091e43 in ap_invoke_handler ()
  #9  0x80a5109 in ap_some_auth_required ()
  #10 0x80a516c in ap_process_request ()
  #11 0x809cb2e in ap_child_terminate ()
  #12 0x809cd6c in ap_child_terminate ()
  #13 0x809ce19 in ap_child_terminate ()
  #14 0x809d446 in ap_child_terminate ()
  #15 0x809dbc3 in main ()
  #16 0x400d3cb3 in __libc_start_main (main=0x809d88c <main>, argc=1,
      argv=0xbffff7e4, init=0x80606f8 <_init>, fini=0x812b33c <_fini>,
      rtld_fini=0x4000a350 <_dl_fini>, stack_end=0xbffff7dc)
      at ../sysdeps/generic/libc-start.c:78

As before, we can see a complete trace of the last executed call.

As you have noticed, I still haven't explained why the process hanging
in the C<while(1)> loop isn't aborted by Apache.  The next section
covers this.
META: add the note about using the 'curinfo' gdb macro to perform an
easy detection of the hanging location.

=head1 Debugging Hanging processes (continued)

META: incomplete

mod_perl comes with a number of useful gdb macros to ease the debugging
process.  You will find the file with the macros in the mod_perl source
distribution in the I<.gdbinit> file (mod_perl-x.xx/.gdbinit).  You
might want to modify the macro definitions.

In order to use this you need to compile mod_perl with
C<PERL_DEBUG=1>.

To debug the server, start it:

  % httpd -X

Issue a request to the offending script that hangs.  Find the PID
number of the process that hangs.

Go to the server root:

  % cd /usr/local/apache

Now attach to it with gdb (replace C<PID> with the actual process id)
and load the macros from I<.gdbinit>:

  % gdb /path/to/httpd PID
  (gdb) source /usr/src/mod_perl-x.xx/.gdbinit

Now you can start the server (I<httpd> below is a gdb macro):

  (gdb) httpd

Now run the C<curinfo> macro:

  (gdb) curinfo

It should tell you the line and filename of the offending Perl code.

Add this to I<.gdbinit>:

  define longmess
  set $sv = perl_eval_pv("Carp::longmess()", 1)
  printf "%s\n", ((XPV*) ($sv)->sv_any )->xpv_pv
  end

and when you reload the macros, run:

  (gdb) longmess

to produce a Perl stacktrace.

=head2 Debugging core Dumping Code

  $ perl -e dump
  Abort(coredump)

META: should I move the C<Apache::StatINC> here?  (I think not, since
it relates to other topics like reloading config files, but you should
mention it here with a pointer to it)

=head1 PERL_DEBUG=1 Build Option

Building mod_perl with C<PERL_DEBUG=1>:

  perl Makefile.PL PERL_DEBUG=1

will:

=over

=item 1

Add `-g' to EXTRA_CFLAGS

=item 2

Turn on PERL_TRACE

=item 3

Set PERL_DESTRUCT_LEVEL=2

=item 4

Link against C<libperld> if
-e $Config{archlibexp}/CORE/libperld$Config{lib_ext}

=back

=head1 Apache::Debug

(META: to be written)

  use Apache::Debug ();
  Apache::Debug::dump($r, SERVER_ERROR, "Uh Oh!");

This module sends what may be helpful debugging information to the
client rather than to I<error_log>.

Also, you could try using a larger emergency pool.  Try this instead of
C<Apache::Debug>:

  $^M = 'a' x (1<<18);  # 256k buffer
  use Carp ();
  $SIG{__DIE__} = \&Carp::confess;
  eval { Carp::confess("init") };

=head1 Debug Tracing

To enable mod_perl debug tracing, configure mod_perl with the
PERL_TRACE option:

  perl Makefile.PL PERL_TRACE=1

The trace levels can then be enabled via the C<MOD_PERL_TRACE>
environment variable, which can contain any combination of:

=over 4

=item c

Trace handling of I<Apache> (non-mod_perl) B<c>onfiguration directives.
(Startup.)

=item d

Trace I<mod_perl> B<d>irective processing during configuration read.
(Startup.)

=item s

Trace processing of I<E<lt>PerlE<gt>> B<s>ections.  (Startup.)

=item h

Trace Perl B<h>andler callbacks.  (RunTime.)

=item g

Trace B<g>lobal variable handling, interpreter construction, C<END>
blocks, etc.  (RunTime.)

=item all

B<all> of the options listed above.  (Startup + RunTime.)

=back

One way of setting this variable is by adding this directive to
I<httpd.conf>:

  PerlSetEnv MOD_PERL_TRACE all

For example, if you want to see a trace of the C<PerlRequire> and
C<PerlModule> directives as they are executed, use:

  PerlSetEnv MOD_PERL_TRACE d

Of course you can use the command line environment setting:

  % setenv MOD_PERL_TRACE all
  % httpd -X

=head1 gdb says there are no debugging symbols

During I<make install> Apache strips all the debugging symbols.  To
prevent this you should use the I<--without-execstrip> C<./configure>
option.
So if you configure Apache via mod_perl, you should do:

  panic% perl Makefile.PL USE_APACI=1 \
    APACI_ARGS='--without-execstrip' [other options]

Alternatively you can copy the unstripped binary manually.  For example
we did:

  panic# cp apache_1.3.17/src/httpd /home/httpd/httpd_perl/bin/httpd_perl

As you know, you need an unstripped executable to be able to debug it.
While you can compile mod_perl with C<-g> (or C<PERL_DEBUG=1>), the
Apache C<install> strips the symbols.

I<Makefile.tmpl> contains a line:

  IFLAGS_PROGRAM = -m 755 -s

Removing the C<-s> does the trick (if you cannot find it in
I<Makefile.tmpl>, do it directly in I<Makefile>).  Alternatively you
can rerun C<make> and copy the unstripped httpd binary away.

=head1 Debugging Signal Handlers ($SIG{FOO})

The current Perl implementation does not restore the original Apache C
handler when you use the C<local $SIG{FOO}> clause.  While the
save/restore of C<$SIG{ALRM}> was fixed in mod_perl 1.19_01 (CVS
version), other signals are not yet fixed.  The real fix should
probably be in Perl itself.

Until recently, C<local $SIG{ALRM}> restored the C<SIGALRM> handler to
Perl's handler, not the handler it was in the first place (Apache's
C<alrm_handler()>).  If you build mod_perl with C<PERL_TRACE=1> and set
the C<MOD_PERL_TRACE> environment variable to B<g>, you will see this
in the I<error_log> file:

  mod_perl: saving SIGALRM (14) handler 0x80b1ff0
  mod_perl: restoring SIGALRM (14) handler from: 0x0 to: 0x80b1ff0

If nobody has touched C<$SIG{ALRM}>, C<0x0> will be the same address as
the others.

If you work with signal handlers, you should take a look at the
C<Sys::Signal> module, which solves the problem:

C<Sys::Signal> - Set signal handlers with restoration of the existing C
sighandler.  Get it from L<CPAN|download/Perl>.

The usage is simple.  If the original code was:

  # If a timeout happens and SIGALRM is thrown, the alarm() will be
  # reset; otherwise 'alarm 0' is reached and the timer is reset as
  # well.
  eval {
      local $SIG{ALRM} = sub { die "timeout\n" };
      alarm $timeout;
      ... db stuff ...
      alarm 0;
  };
  die $@ if $@;

now you would write:

  use Sys::Signal ();
  eval {
      my $h = Sys::Signal->set(ALRM => sub { die "timeout\n" });
      alarm $timeout;
      ... do something that may timeout ...
      alarm 0;
  };
  die $@ if $@;

This should be fixed in Perl 5.6.1, so if you use this version of Perl,
chances are that you don't need to use C<Sys::Signal>.

mod_perl tries to deal only with those signals that cause conflicts
with Apache's.  Currently this is only C<SIGALRM>.  If there is another
one that gives you trouble, you can add it to the list in
I<perl_config.c> after I<"ALRM">, before I<NULL>:

  static char *sigsave[] = { "ALRM", NULL };

=head1 Code Profiling

(Meta: duplication??? I've started to write about profiling somewhere
in this file)

It is possible to profile code run under mod_perl with the
C<Devel::DProf> module available on CPAN.  However, you must have
Apache version 1.3b3 or higher and the C<PerlChildExitHandler> enabled.
When the server is started, C<Devel::DProf> installs an C<END> block
(to write the C<tmon.out> file) which will be run when the server is
shutdown.

Here's how to start and stop a server with the profiler enabled:

  % setenv PERL5OPT -d:DProf
  % httpd -X -d `pwd` &
  ... make some requests to the server here ...
  % kill `cat logs/httpd.pid`
  % unsetenv PERL5OPT
  % dprofpp

See also: C<Apache::DProf>

=head1 Devel::Peek

Devel::Peek - A data debugging tool for the XS programmer

Let's see an example of Perl allocating a buffer only once, regardless
of my() scoping, although it will realloc() if the size is bigger than
C<SvLEN>:

  use Devel::Peek;

  for (1..3) {
      foo();
  }

  sub foo {
      my $sv;
      Dump $sv;
      $sv = 'x' x 100_000;
      $sv = "";
  }

The output:

  SV = NULL(0x0) at 0x8138008
    REFCNT = 1
    FLAGS = (PADBUSY,PADMY)
  SV = PV(0x80e5794) at 0x8138008
    REFCNT = 1
    FLAGS = (PADBUSY,PADMY)
    PV = 0x815f808 ""\0
    CUR = 0
    LEN = 100001
  SV = PV(0x80e5794) at 0x8138008
    REFCNT = 1
    FLAGS = (PADBUSY,PADMY)
    PV = 0x815f808 ""\0
    CUR = 0

We can see that on the second and subsequent calls C<$sv> already has
previously allocated memory.  So, if you can afford the memory, a
larger buffer means fewer C<brk()> syscalls.  If you watch that example
with C<strace>, you will only see calls to C<brk()> the first time
through the loop.  So this is a case where your module might want to
pre-allocate the buffer like this:

  package Your::Proxy;

  my $buffer = ' ' x 100_000;
  $buffer = "";

Now only the parent has to brk() at server startup; each child will
already have an allocated buffer.  Just reset it to C<""> when you are
done.

Note: preallocating a scalar in this way saves reallocation in v5.005,
but may not do so in other versions.

=head1 How can I find out if a mod_perl code has a memory leak

The C<Apache::Leak> module (derived from C<Devel::Leak>) should help
you detect memory leaks in your code.  For example:

  leaktest.pl
  -----------
  use Apache::Leak;

  my $global = "FooAAA";

  leak_test {
      $$global = 1;
      ++$global;
  };

The argument to C<leak_test()> is an anonymous sub, so you can just
pass it any code you suspect might be leaking.  Beware: it will run the
code twice!  The first time in, new C<SV>s are created, but that does
not mean you are leaking.  The second pass will give better evidence.

You do not need to be inside mod_perl to use it.  From the command
line, the above script outputs:

  ENTER: 1482 SVs
  new c28b8 : new c2918 :
  LEAVE: 1484 SVs
  ENTER: 1484 SVs
  new db690 : new db6a8 :
  LEAVE: 1486 SVs
  !!! 2 SVs leaked !!!

Build a debuggable Perl to see dumps of the C<SV>s.  The simple way to
have both a normal Perl and a debuggable Perl is to follow the hints in
the C<SUPPORT> doc for building C<libperld.a>.  When that is built,
copy the C<perl> from that directory to your Perl bin directory, but
name it C<dperl>.

Our example's leak explained: C<$$global = 1;> creates a new global
variable C<FooAAA> (via the symbolic reference) with the value of C<1>.
This variable will not be destroyed until this module is destroyed, and
under mod_perl the module doesn't get destroyed until the process
quits.

C<Apache::Leak> is not very user-friendly; have a look at
C<B::LexInfo>.  It is possible to see something that might appear to be
a leak, but is actually just a Perl optimization.  e.g. consider this
code:

  sub foo {
      my $string = shift;
  }

  foo("a string");

C<B::LexInfo> will show you that Perl does not release the value from
$string unless you undef() it.  This is because Perl anticipates that
the memory will be needed for another string the next time the
subroutine is entered.  You'll see similar behaviour for C<@array>
length, C<%hash> keys, and scratch areas of the pad-list for OPs such
as join(), `C<.>', etc.

C<Apache::Status> includes a C<StatusLexInfo> option which can show you
the internals of your code.
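You don't need C<B::LexInfo> to watch this particular effect.  Here is
a minimal sketch (my own illustration, reusing the C<Devel::Peek>
technique shown above) that dumps C<$string> on each call:

  use Devel::Peek;

  sub foo {
      my $string = shift;
      Dump $string;    # on the second call CUR drops to 1, but LEN
                       # still shows the buffer kept from the first,
                       # longer string
  }

  foo("a longer string");
  foo("x");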
=head1 Debugging your code in Single Server Mode

Running in C<httpd -X> mode is good only for testing during the
development phase.

You want to test that your application correctly handles global
variables (if you have any -- the fewer of them you have the better, of
course, but sometimes you just can't do without them).  It's hard to
test this with multiple servers serving your CGI script, since each
child has a different set of values for its global variables.

Imagine that you have a C<random()> sub that returns a random number,
and you have the following script:

  use vars qw($num);
  $num ||= random();
  print ++$num;

This script initializes the variable C<$num> with a random value, then
increments it on each request and prints it out.  Running this script
in a multiple server environment will result in something like C<1>,
C<9>, C<4>, C<19> (a different number each time you hit the browser's
reload button), since each time your script will be served by a
different child.  (On some operating systems, e.g. AIX, the parent
httpd process will assign all of the requests to the same child process
if all of the children are idle.)  But if you run in C<httpd -X> single
server mode, you will get C<2>, C<3>, C<4>, C<5>... (assuming that
C<random()> returned C<1> on the first call).

But do not get too obsessive with this mode, since working in single
server mode sometimes hides problems that show up when you switch to
normal (multi-server) mode.

Consider an application that allows you to change the configuration at
run time.  Let's say the script produces a form to change the
background color of the page.  This is not good design, but for the
sake of demonstrating the potential problem we will assume that our
script doesn't write the changed background color to the disk, but
simply changes it in memory, like this:

  use vars qw($bgcolor);
  # assign default value at first invocation
  $bgcolor ||= "white";
  # modify the color if requested to
  $bgcolor = $q->param('bgcolor') || $bgcolor;

So you have typed in a new color, and in response your script prints
back the HTML with the new color -- you think that's it!  It was so
simple.  If you keep running in single server mode you will never
notice that you have a problem...

If you run the same code in normal server mode, after you submit the
color change you will get the result as expected, but when you call the
same URL again (not reload!) the chances are that you will get back the
original default color (white in our case), since only the child which
processed the color change request knows about the global variable
change.  Just remember that children can't share information, other
than that which they inherited from their parent at their birth.  Of
course you could carry the color with the form itself for it to be
remembered, or store it on the server side (database, shared memory,
etc.); a small sketch of the form-based approach follows at the end of
this section.

If you use the Netscape client while your server is running in
single-process mode, and the output returns HTML with
C<E<lt>IMGE<gt>> tags, then the loading of the images will take a long
time, since Netscape's C<KeepAlive> feature gets in the way.  Netscape
tries to open multiple connections and keep them open.  Because there
is only one server process listening, each connection has to time out
before the next one succeeds.  Turn off C<KeepAlive> in I<httpd.conf>
to avoid this effect.  Alternatively (assuming you use the image size
parameters, so that Netscape will be able to render the rest of the
page) you can press B<STOP> after a few seconds.
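As promised, here is a minimal sketch of keeping the color in the form
itself instead of in a per-child global (a hypothetical script, not
from the original text; a visible text field plays the role of the
hidden variable here, so the user can also edit it):

  use CGI ();

  my $q = CGI->new;

  # the color arrives with each request, so it doesn't matter
  # which child serves it
  my $bgcolor = $q->param('bgcolor') || 'white';

  print $q->header,
        qq{<html><body bgcolor="$bgcolor"><form>},
        qq{<input type="text" name="bgcolor" value="$bgcolor">},
        qq{<input type="submit" value="change color">},
        qq{</form></body></html>};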
In addition you should be aware that when running with C<-X> you will
not see the status messages that the parent server normally writes to
the I<error_log> ("server started", "server stopped", etc.).  Since
C<httpd -X> causes the server to handle all requests itself, without
forking any children, there is no controlling parent to write the
status messages.

=head1 Apache::DumpHeaders - Watch HTTP Transaction Via Headers

This module is used to watch an HTTP transaction, looking at the client
and server headers.

With C<Apache::ProxyPassThru> configured, you are able to watch your
browser talk to any server besides the one this module lives inside.

C<Apache::DumpHeaders> has the ability to filter on IP addresses, has
an interface for other modules to decide if the headers should be
dumped or not, and has a function to dump only I<n%> of the
transactions.

For more information read the module's manpage.

Download the module from L<CPAN|download/CPAN_Downloads>.

=head1 Apache::DebugInfo - Log Various Bits Of Per-Request Data

C<Apache::DebugInfo> offers the ability to monitor various bits of
per-request data.  Its functionality is similar to
L<Apache::DumpHeaders|debug/Debugging_mod_perl_> while offering several
additional features, including the ability to:

=over

=item - separate inbound from outbound HTTP headers

=item - view the contents of $r-E<gt>notes and $r-E<gt>pnotes

=item - view any of these at the various points in the request cycle

=item - add output for any request phase from a single entry point

=item - use as a PerlInitHandler or with direct method calls

=item - use partial IP addresses for filtering by IP

=item - offer a subclassable interface

=back

See the module's manpage for more details.

=cut

1.1                  modperl-docs/src/docs/1.0/guide/download.pod

Index: download.pod
===================================================================

=head1 NAME

Appendix A: Downloading software and documentation

=head1 Coverage

Here you will find instructions for downloading the software and its
related documentation.

=head1 Perl

Perl is probably already installed on your machine, but you should at
least check the version you are using.  It is highly recommended that
you have at least Perl version 5.004.  You can get the latest perl
version from http://www.perl.com/ .  Try the direct download link
http://www.perl.com/pace/pub/perldocs/latest.html .  You can get Perl
documentation from the same location (although copious documentation is
included in the downloaded Perl distribution).

=head1 CPAN Downloads

You can download most of the Perl modules from CPAN.  There are many
mirrors of this site.  The main site's URL is http://cpan.org/.

You may want to search the Perl modules database by using
http://search.cpan.org/.

Either use the search form, or type in the name of the package the
module is distributed in.  For example, if you are looking for
C<Apache::DumpHeaders>, you can type:
http://search.cpan.org/search?dist=Apache-DumpHeaders .

=head1 Apache

Get the latest Apache webserver and documentation from
http://www.apache.org .  Try the direct download link
http://www.apache.org/dist/ .

=head1 mod_perl

Get the latest mod_perl sources and documentation from
http://perl.apache.org .  Try the direct download link
http://perl.apache.org/dist/ .

Source/Binary Distributions: http://perl.apache.org/distributions.html

Every Apache project rolls a new tar.gz snapshot of the latest CVS
version every 6 hours.
You can grab the latest mod_perl CVS snapshot from
http://perl.apache.org/snapshots/modperl/; all the mod_perl related
projects are available from http://perl.apache.org/snapshots/.

RPM: http://perl.apache.org/rpm/

Debian users will find Perl, Apache and mod_perl are available as .deb
files on official image CDs or from the Debian web site
http://www.debian.org .  The Debian distribution also contains many
additional Perl and Apache libraries and modules.

=head1 Squid - Internet Object Cache

http://squid.nlanr.net/

Squid Linux 2.x Redhat RPMs: http://home.earthlink.net/~intrep/linux/

=head1 thttpd - tiny/turbo/throttling HTTP server

http://www.acme.com/software/thttpd/

=head1 mod_throttle_access

http://www.fremen.org/apache/mod_throttle_access.html

=head1 mod_proxy_add_forward

Ask Bjoern Hansen has written the C<mod_proxy_add_forward.c> module for
Apache, which sets the C<X-Forwarded-For> field when doing a ProxyPass,
similar to what Squid does.  His module is available from one of these
URLs: http://modules.apache.org/search?id=124,
http://develooper.com/code/mpaf/mod_proxy_add_forward.c or
http://www.cpan.org/authors/id/ABH/mod_proxy_add_forward.c, complete
with instructions on how to compile it and whatnot.

=head1 httperf -- webserver Benchmarking tool

http://www.hpl.hp.com/personal/David_Mosberger/httperf.html

=head1 http_load -- another webserver Benchmarking tool

http://www.acme.com/software/http_load/

=head1 ab -- ApacheBench

ApacheBench comes with the Apache distribution.

=head1 Daquiri -- yet another webserver Benchmarking tool

Should be available from the mod_backhand CVS tree:
http://www.backhand.org/mod_backhand/

=head1 High-Availability and Load Balancing Projects

=head2 mod_backhand -- Load Balancing for Apache

http://www.backhand.org/mod_backhand/

=head2 mod_redundancy

mod_redundancy is a module that works with the Apache webserver.  It
creates a master/slave relationship between two physical webservers.
The slave takes over the IP address(es) and the web service(s) in case
of a failure of the master.  A key feature of this solution is that the
redundancy/failover configuration is done inside the Apache
configuration file.  The product is neither Open Source nor free. :(

The homepage of mod_redundancy is http://www.ask-the-guru.com .

=head2 High-Availability Linux Project

You will find the definitive guide to load balancing techniques at the
High-Availability Linux Project site --
http://www.henge.com/~alanr/ha/

=head2 lbnamed - a Load Balancing Name Server Written in Perl

http://www.stanford.edu/~riepel/lbnamed/
http://www.stanford.edu/~riepel/lbnamed/bof.talk/
http://www.stanford.edu/~schemers/docs/lbnamed/lbnamed.html

=head2 Network Address Translation and Networks: Virtual Servers (Load Balancing)

http://www.csn.tu-chemnitz.de/~mha/linux-ip-nat/diplom/node4.html#SECTION00043100000000000000

=head2 Linux Virtual Server Project

http://www.linuxvirtualserver.org/

=head2 Efficient Support for P-HTTP in Cluster-Based Web Servers

(with Mohit Aron and Willy Zwaenepoel.)  In Proceedings of the USENIX
1999 Annual Technical Conference, Monterey, CA, June 1999.

http://www.cs.rice.edu/~druschel/usenix99lard.ps.gz
http://www.usenix.org/publications/library/proceedings/usenix99/full_papers/aron/aron_html/index.html

=head2 IP Filter

The latest ip filter includes some simple load balancing code that
allows round-robin distribution onto several machines via ipnat.  That
may be a simple solution for some specific load problems.
http://coombs.anu.edu.au/~avalon/ipf3.4beta3.tgz =head1 Apache::Request The package name is I<libapreq>. Get it from your favorite CPAN mirror at $CPAN/authors/id/DOUGM/ or from http://perl.apache.org/dist/. =head1 DataBases Low-Cost Unix Database Differences http://www.toodarkpark.org/computers/dbs.html My collection of various links to database implementations http://stason.org/TULARC/webmaster/db.html =head1 libgtop LibGTop is a library that fetches system-related information such as CPU load, memory usage and information about running processes. The module C<GTop> provides a Perl interface to this library. http://home-of-linux.org/gnome/libgtop/ =cut 1.1 modperl-docs/src/docs/1.0/guide/frequent.pod Index: frequent.pod =================================================================== =head1 NAME Frequent mod_perl problems =head1 Coverage Some problems come up very often on the mailing list. If there is some important problem that is being reported frequently on the list which is not included below, even if it is found elsewhere in the Guide, please L<tell me|help/Contacting_me>. =head1 my() scoped variable in nested subroutines See the section "L<my() Scoped Variable in Nested Subroutines|perl/my_Scoped_Variable_in_Nested_S>". =head1 Segfaults caused by PerlFreshRestart See the section L<Evil things might happen when using PerlFreshRestart|troubleshooting/Evil_things_might_happen_when_us> =cut 1.1 modperl-docs/src/docs/1.0/guide/hardware.pod Index: hardware.pod =================================================================== =head1 NAME Choosing an Operating System and Hardware =head1 Is it important? Before you use the techniques in this Guide to tune servers and write code you need to consider the demands which will be placed on the hardware and the operating system. There is no point in investing a lot of time and money in configuration and coding only to find that your server's performance is poor because you did not choose a suitable platform in the first place. While the tips below could apply to many web servers, they are aimed primarily at administrators of mod_perl enabled Apache servers. Because hardware platforms and operating systems are developing rapidly (even while you are reading this Guide), this discussion must be in general terms. =head1 Choosing an Operating System First let's talk about Operating Systems (OSs). Most of the time I prefer to use Linux or something from the *BSD family. Although I am personally a Linux devotee, I do not want to start yet another OS war. I will try to talk about what characteristics and features you should be looking for to support an Apache/mod_perl server; then, when you know what you want from your OS, you can go out and find it. Visit the Web sites of the operating systems you are interested in. You can gauge users' opinions by searching the relevant discussions in newsgroups and mailing list archives. Deja - http://deja.com and eGroups - http://egroups.com are good examples. I will leave this research to the reader. =head2 Stability and Robustness Probably the most important features in an OS are stability and robustness. You are in an Internet business. You do not keep normal 9am to 5pm working hours like many conventional businesses you know. You are open 24 hours a day. You cannot afford to be off-line, for your customers will go and shop at another service like yours (unless you have a monopoly :). If the OS of your choice crashes every day, first do a little investigation.
There might be a simple reason which you can find and fix. There are OSs which won't work unless you reboot them twice a day. You don't want to use an OS of this kind, no matter how good the OS vendor's sales department is. Do not follow flashy advertisements; follow developers' advice instead. Generally, people who have used the OS for some time can tell you a lot about its stability. Ask them. Try to find people who are doing similar things to what you are planning to do; they may even be using the same software. There are often compatibility issues to resolve. You may need to become familiar with patching and compiling your OS. It's easy. =head2 Memory Management You want an OS with good memory management; some OSs are well known as memory hogs. The same code can use twice as much memory on one OS compared to another. If the size of the mod_perl process is 10Mb and you have tens of these running, it definitely adds up! =head2 Memory Leaks Some OSs and/or their libraries (e.g. C runtime libraries) suffer from memory leaks. A leak is when some process requests a chunk of memory for temporary storage, but then does not subsequently release it. The chunk of memory is then not available for any purpose until the process which requested it dies. We cannot afford such leaks. A single mod_perl process sometimes serves thousands of requests before it terminates. So if a leak occurs on every request, the memory demands could become huge. Of course our code can be the cause of the memory leaks as well (check out the C<Apache::Leak> module on CPAN). Certainly, we can reduce the number of requests to be served over the process' life, but that can degrade performance. =head2 Sharing Memory We want an OS with good memory sharing capabilities. As we have seen, if we preload the modules and scripts at server startup, they are shared between the spawned children (at least for a part of a process' life - memory pages can become "dirty" and cease to be shared). This feature can reduce memory consumption a lot! =head2 Cost and Support If we are in a big business we probably do not mind paying another $1000 for some fancy OS with bundled support. But if our resources are low, we will look for cheaper and free OSs. Free does not mean bad; it can be quite the opposite. Free OSs can have the best support we can find. Some do. It is very easy to understand - most people are not rich and will try to use a cheaper or free OS first if it does the work for them. Since it really fits their needs, many people keep using it and eventually know it well enough to be able to provide support for others in trouble. Why would they do this for free? One reason is the spirit of the first days of the Internet, when there was no commercial Internet and people helped each other, because someone had helped them in the first place. I was there, I was touched by that spirit and I am keen to keep that spirit alive. But, let's get back to our world. We are living in a material world, and our bosses pay us to keep the systems running. So if you feel that you cannot provide the support yourself and you do not trust the available free resources, you must pay for an OS backed by a company, and blame them for any problem. Your boss wants to be able to sue someone if the project has a problem caused by an external product that is being used in the project. If you buy a product and the company selling it claims support, you have someone to sue or at least to put the blame on.
If we go with Open Source and it fails we do not have someone to sue... wrong--in recent years many companies have realized how good Open Source products are and have started to provide official support for them. So your boss cannot just dismiss your suggestion of using an Open Source Operating System. You can get paid support just as with any other commercial OS vendor. Also remember that the less money you spend on the OS and software, the more you will be able to spend on faster and stronger hardware. =head2 Discontinued Products The OSs in this hazard group tend to be developed by a single company or organization. You might find yourself in a position where you have invested a lot of time and money into developing some proprietary software that is bundled with the OS you chose (say writing a mod_perl handler which takes advantage of some proprietary features of the OS and which will not run on any other OS). Things are under control, the performance is great and you sing with happiness on your way to work. Then, one day, the company which supplies your beloved OS goes bankrupt (not unlikely nowadays), or they produce a newer incompatible version and they will not support the old one (it happens all the time). You are stuck with their early masterpiece, no support and no source code! What are you going to do? Invest more money into porting the software to another OS... Everyone can be hit by this mini-disaster so it is better to check the background of the company when making your choice. Even so you never know what will happen tomorrow - in 1980, a company called Tektronix did something similar to one of the Guide reviewers with its microprocessor development system. The guy just had to buy another system. He didn't buy it from Tektronix, of course. The second system never really worked very well and the firm he bought it from went bust before they ever got around to fixing it. So in 1982 he wrote his own microprocessor development system software. It didn't take long, it works fine, and he's still using it 18 years later. Free and Open Source OSs are probably less susceptible to this kind of problem. Development is usually distributed between many companies and developers, so if a person who developed a really important part of the kernel loses interest in continuing, someone else will pick up the flag and carry on. Of course if tomorrow some better project shows up, developers might migrate there and finally drop the development: but in practice people are often given support on older versions and helped to migrate to current versions. Development tends to be more incremental than revolutionary, so upgrades are less traumatic, and there is usually plenty of notice of the forthcoming changes so that you have time to plan for them. Of course with the Open Source OSs you can have the source! So you can always have a go yourself, but do not underestimate the amount of work involved. There are many, many man-years of work in an OS. =head2 OS Releases Actively developed OSs generally try to keep pace with the latest technology developments, and continually optimize the kernel and other parts of the OS to become better and faster. Nowadays, Internet and networking in general are the hottest topics for system developers. Sometimes a simple OS upgrade to the latest stable version can save you an expensive hardware upgrade. Also, remember that when you buy new hardware, chances are that the latest software will make the most of it.
If a new product supports an old one by virtue of backwards compatibility with previous products of the same family, you might not reap all the benefits of the new product's features. Perhaps you could get almost the same functionality for much less money by buying an older model of the same product. =head1 Choosing Hardware Sometimes the most expensive machine is not the one which provides the best performance. Your demands on the platform hardware are based on many aspects and affect many components. Let's discuss some of them. In the discussion we use terms that may be unfamiliar to some readers: =over 4 =item * Cluster - a group of machines connected together to perform one big or many small computational tasks in a reasonable time. Clustering can also be used to provide 'fail-over' where if one machine fails its processes are transferred to another without interruption of service. And you may be able to take one of the machines down for maintenance (or an upgrade) and keep your service running - the main server will simply not dispatch the requests to the machine that was taken down. =item * Load balancing - users are given the name of one of your machines but perhaps it cannot stand the heavy load. You can use a clustering approach to distribute the load over a number of machines. The central server, which users access initially when they type the name of your service, works as a dispatcher. It just redirects requests to other machines. Sometimes the central server also collects the results and returns them to the users. You can get the advantages of clustering too. There are many load balancing techniques. (See L<High-Availability Linux Project|download/High_Availability_Linux_Project> for more info.) =item * NIC - Network Interface Card. A hardware component that connects your machine to the network. It sends and receives packets; newer cards can also encrypt and decrypt packets and perform digital signing and verification. NICs come in different speed categories, varying from 10Mbps to 10Gbps and faster. The most widely used type of NIC implements the Ethernet networking protocol. =item * RAM - Random Access Memory. It's the memory that you have in your computer. (Comes in units of 8Mb, 16Mb, 64Mb, 256Mb, etc.) =item * RAID - Redundant Array of Inexpensive Disks. An array of physical disks, usually treated by the operating system as one single disk, and often forced to appear that way by the hardware. The reason for using RAID is often simply to achieve a high data transfer rate, but it may also be to get adequate disk capacity or high reliability. Redundancy means that the system is capable of continued operation even if a disk fails. There are various types of RAID array and several different approaches to implementing them. Some systems provide protection against failure of more than one drive and some (`hot-swappable') systems allow a drive to be replaced without even stopping the OS. See for example the Linux `HOWTO' documents Disk-HOWTO, Module-HOWTO and Parallel-Processing-HOWTO. =back =head2 Machine Strength Demands According to Expected Site Traffic If you are building a fan site and you want to amaze your friends with a mod_perl guest book, any old 486 machine could do it. If you are in a serious business, it is very important to build a scalable server. If your service is successful and becomes popular, the traffic could double every few days, and you should be ready to add more resources to keep up with the demand.
While we can define the webserver scalability more precisely, the important thing is to make sure that you can add more power to your webserver(s) without investing much additional money in software development (you will need a little software effort to connect your servers, if you add more of them). This means that you should choose hardware and OSs that can talk to other machines and become a part of a cluster. On the other hand if you prepare for a lot of traffic and buy a monster to do the work for you, what happens if your service doesn't prove to be as successful as you thought it would be? Then you've spent too much money, and meanwhile faster processors and other hardware components have been released, so you lose. Wisdom and prophecy, that's all it takes :) =head3 Single Strong Machine vs Many Weaker Machines Let's start with a claim that a four-year-old processor is still very powerful and can be put to good use. Now let's say that for a given amount of money you can probably buy either one new very strong machine or about ten older but very cheap machines. I claim that with ten old machines connected into a cluster and by deploying load balancing you will be able to serve about five times more requests than with one single new machine. Why is that? Because generally the performance improvement on a new machine is marginal while the price is much higher. Ten machines will do faster disk I/O than one single machine, even if the new disk is quite a bit faster. Yes, you have more administration overhead, but there is a chance you will have it anyway, for in a short time the new machine you have just bought might not stand the load. Then you will have to purchase more equipment and think about how to implement load balancing and web server file system distribution anyway. Why am I so convinced? Look at the busiest services on the Internet: search engines, web-email servers and the like -- most of them use a clustering approach. You may not always notice it, because they hide the real implementation behind proxy servers. =head2 Internet Connection You have the best hardware you can get, but the service is still crawling. Make sure you have a fast Internet connection. Not as fast as your ISP claims it to be, but as fast as it should be. The ISP might have a very good connection to the Internet, but put many clients on the same line. If these are heavy clients, your traffic will have to share the same line and your throughput will suffer. Think about a dedicated connection and make sure it is truly dedicated. Don't trust the ISP, check it! The idea of having a connection to B<The Internet> is a little misleading. Many Web hosting and co-location companies have large amounts of bandwidth, but still have poor connectivity. The public exchanges, such as MAE-East and MAE-West, frequently become overloaded, yet many ISPs depend on these exchanges. Private peering means that providers can exchange traffic much quicker. Also, if your Web site is of global interest, check that the ISP has good global connectivity. If the Web site is going to be visited mostly by people in a certain country or region, your server should probably be located there. Bad connectivity can directly influence your machine's performance. Here is a story one of the developers told on the mod_perl mailing list: What relationship has 10% packet loss on one upstream provider got to do with machine memory ? Yes.. a lot.
For a nightmare week, the box was located downstream of a provider who was struggling with some serious bandwidth problems of his own... people were connecting to the site via this link, and packet loss was such that retransmits and tcp stalls were keeping httpd heavies around for much longer than normal.. instead of blasting out the data at high or even modem speeds, they would be stuck at 1k/sec or stalled out... people would press stop and refresh, httpds would take 300 seconds to timeout on writes to no-one.. it was a nightmare. Those problems didn't go away till I moved the box to a place closer to some decent backbones. Note that with a proxy, this only keeps a lightweight httpd tied up, assuming the page is small enough to fit in the buffers. If you are a busy internet site you always have some slow clients. This is a difficult thing to simulate in benchmark testing, though. =head2 I/O Performance If your service is I/O bound (does a lot of read/write operations to disk) you need a very fast disk, especially if you need a relational database, since RDBMSs are the main creators of I/O streams. So you should not spend the money on a fancy video card and monitor! A cheap card and a 14" monochrome monitor are perfectly adequate for a Web server; you will probably access it by C<telnet> or C<ssh> most of the time. Look for disks with the best price/performance ratio. Of course, ask around and avoid disks that have a reputation for headcrashes and other disasters. You must think about RAID or similar systems if you have an enormous data set to serve (what is an enormous data set nowadays? Gigabytes, Terabytes?) or you expect really heavy web traffic. Ok, you have a fast disk, what's next? You need a fast disk controller. There may be one embedded on your computer's motherboard. If the controller is not fast enough you should buy a faster one. Don't forget that it may be necessary to disable the original controller. =head2 Memory Memory should be well tested. Many memory test programs are practically useless. Running a busy system for a few weeks without ever shutting it down is a pretty good memory test. If you increase the amount of RAM on a well-tested box, use well-tested RAM. How much RAM do you need? Nowadays, the chances are that you will hear: "Memory is cheap, the more you buy the better". But how much is enough? The answer is pretty straightforward: I<you do not want your machine to swap>. When the CPU needs to write something into memory, but memory is already full, the operating system takes the least frequently used memory pages and swaps them out to disk. This means you have to bear the time penalty of writing the data to disk. If another process then references some of the data which happens to be on one of the pages that has just been swapped out, it is swapped back in again, probably swapping out some other data that will be needed very shortly by some other process. Carried to the extreme, the CPU and disk start to I<thrash> hopelessly in circles, without getting any real work done. The less RAM there is, the more often this scenario arises. Worse, you can exhaust swap space as well, and then your troubles really start... How do you make a decision? You know the highest rate at which your server expects to serve pages and how long it takes on average to serve one. Now you can calculate how many server processes you need. If you know the maximum size your servers can grow to, you know how much memory you need.
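To make this calculation concrete, here is a back-of-the-envelope sketch. All the numbers in it are made up for illustration; substitute your own measured request rates and process sizes:

  expected peak rate     = 40 requests per second
  average service time   = 0.5 seconds per request
  concurrent processes   = 40 * 0.5  = 20 processes
  maximum process size   = 10MB
  memory for httpd alone = 20 * 10MB = 200MB

So under these assumptions you would budget roughly 200MB for the mod_perl server itself, before accounting for the operating system, the other processes and the safety margin discussed below.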
If your OS supports L<memory sharing|hardware/Sharing_Memory>, you can make the best use of this feature by preloading the modules and scripts at server startup, and so you will need less memory than you have calculated. Do not forget that other essential system processes need memory as well, so you should plan not only for the Web server, but also take into account the other players. Remember that requests can be queued, so you can afford to let your client wait for a few moments until a server is available to serve it. Most of the time your server will not have the maximum load, but you should be ready to bear the peaks. You need to reserve at least 20% of free memory for peak situations. Many sites have crashed a few moments after a big scoop about them was posted and an unexpected number of requests suddenly came in. (This is called the Slashdot effect, which was born at http://slashdot.org ). If you are about to announce something cool, be aware of the possible consequences. =head2 CPU Make sure that the CPU is operating within its specifications. Many boxes are shipped with incorrect settings for CPU clock speed, power supply voltage etc. Sometimes a cooling fan is not fitted, or it may be ineffective because a cable assembly fouls the fan blades. Like faulty RAM, an overheating processor can cause all kinds of strange and unpredictable things to happen. Some CPUs are known to have bugs which can be serious in certain circumstances. Try not to get one of them. =head2 Bottlenecks You might use the most expensive components, but still get bad performance. Why? Let me introduce an annoying word: bottleneck. A machine is an aggregate of many components. Almost any one of them may become a bottleneck. If you have a fast processor but a small amount of RAM, the RAM will probably be the bottleneck. The processor will be under-utilized; usually it will be waiting for the kernel to swap the memory pages in and out, because memory is too small to hold the busiest pages. If you have a lot of memory, a fast processor, a fast disk, but a slow disk controller, the disk controller will be the bottleneck. The performance will still be bad, and you will have wasted money. Use a fast NIC that does not create a bottleneck. They are cheap. If the NIC is slow, the whole service is slow. This is a most important component, since webservers are much more often network-bound than they are disk-bound! =head3 Solving Hardware Requirement Conflicts It may happen that the combination of software components which you find yourself using gives rise to conflicting requirements for the optimization of tuning parameters. If you can separate the components onto different machines you may find that this approach (a kind of clustering) solves the problem, at much less cost than buying faster hardware, because you can tune the machines individually to suit the tasks they should perform. For example if you need to run a relational database engine and a mod_perl server, it can be wise to put the two on different machines, since while an RDBMS needs a very fast disk, mod_perl processes need lots of memory. So by placing the two on different machines it's easy to optimize each machine separately and satisfy each software component's requirements in the best way. =head2 Conclusion To use your money optimally you have to understand the hardware very well, so you will know what to pick.
Otherwise, you should hire a knowledgeable hardware consultant and employ them on a regular basis, since your needs will probably change as time goes by and your hardware will be forced to adapt as well. =cut 1.1 modperl-docs/src/docs/1.0/guide/help.pod Index: help.pod =================================================================== =head1 NAME Getting Help and Further Learning =head1 READ ME FIRST If, after reading this guide and the other documents listed in this section, you still don't have the answers/information needed, please ask for help on the Apache/mod_perl mailing list. But please, first try to browse the mailing list archive. Most of the time you will find the answer to your questions by searching the archive, since it is very likely that someone else has already encountered the same problem and found a solution for it. If you ignore this advice, you should not be surprised if your question is left unanswered - it bores people to answer the same question more than once. This does not mean that you should avoid asking questions, but you should not abuse the available help and you should I<RTFM> before you call for I<HELP>. (Remember the fable of the shepherd boy and the wolves). For more information see L<Get help with mod_perl|help/Get_help_with_mod_perl>. =head1 Contacting me Hi, I wrote this document to help people with mod_perl. It does not mean that if you have any question regarding mod_perl, perl or whatever you think I might know, you should send it directly to me. Please see the L<Get help with mod_perl|help/Get_help_with_mod_perl> section and follow the guidelines described there. However, you are welcome to submit corrections and suggestions directly to me at [EMAIL PROTECTED] If you are going to submit heavy corrections of the text (I love those!), please help me by downloading the source pages in POD from http://www.stason.org/guide-snapshots/, directly editing them and sending them to me. I will use Emacs Ediff to perform an easy merge of such changes. Thank you! I<PLEASE DO NOT SEND QUESTIONS DIRECTLY TO ME, I didn't invite those by writing the guide. They will all be immediately deleted. Please ask questions on the mod_perl list and if we can answer your question, one (or more) of us will answer it on the list. Thank you!> =head2 How to Report Problems Always send these details: =over =item * Anything in the I<error_log> file that looks suspicious and possibly related to the problem. =item * Output of C<perl -V> =item * Version of mod_perl =item * Version of Apache =item * Options given to mod_perl's Makefile.PL =item * Server configuration details =item * Relevant sections of your ErrorLog (for 'make test' it is: t/logs/error_log) =item * If 'make test' fails, the output of 'make test TEST_VERBOSE=1' =back Also check whether: =over =item * 'make test' passes 100% =item * the script works under mod_cgi, if applicable =back You should try to isolate the problem and send the smallest possible code snippet that reproduces it. If the issue is unique and we cannot reproduce the problem, it's hard to solve. =over =item Getting the Backtrace From Core Dumps If you get a I<core> file dump (I<Segmentation fault>), please send a backtrace if possible.
Before you try to produce it, re-build mod_perl with: panic% perl Makefile.PL PERL_DEBUG=1 which will: =over =item * add C<-g> to C<EXTRA_CFLAGS> =item * turn on C<PERL_TRACE> =item * set C<PERL_DESTRUCT_LEVEL=2> (additional checks during Perl cleanup) =item * link against I<libperld> if it exists =back Here is a summary of how to get a backtrace: panic% cd mod_perl-x.xx panic% touch t/conf/srm.conf panic% gdb ../apache_x.xx/src/httpd (gdb) run -X -f `pwd`/t/conf/httpd.conf -d `pwd`/t [now make request that causes core dump] (gdb) bt So you go to the mod_perl source directory, create an empty I<srm.conf> file, and start gdb with a path to the httpd binary, which at this point is located in the Apache source tree where you built it. (Of course replace I<x> with real version numbers). The next step is to start httpd from within gdb and issue a request that causes a core dump. When the code has died with a SEGV signal, run I<bt> to get the backtrace. Alternatively you can also attach to an already running process like so: panic% gdb httpd <process id number> If the dump is happening in I<libperl> you have to rebuild Perl with C<-DDEBUGGING> enabled. A quick way to do this is to go to your Perl source tree and run these commands: panic% rm *.[oa] panic% make LIBPERL=libperld.a panic% cp libperld.a $Config{archlibexp}/CORE where C<$Config{archlibexp}> is: % perl -V:archlibexp =item Spinning Processes The approach of attaching gdb to a live process is helpful when debugging a I<spinning> process. You can also get a Perl stacktrace of a I<spinning> process by installing a C<$SIG{USR1}> handler in your code: use Carp (); $SIG{USR1} = \&Carp::confess; While the process is spinning, send it a I<USR1> signal: panic% kill -USR1 <process id number> and the Perl stack trace will be printed. Alternatively you can use gdb to find which Perl code is causing the spin: panic% gdb httpd <pid of spinning process> (gdb) where (gdb) source mod_perl-x.xx/.gdbinit (gdb) curinfo After loading the special macros file (I<.gdbinit>) you can use the I<curinfo> gdb macro to figure out the file and line number the code is stuck in. =back Finally, send all these details to the mod_perl mailing list. =head1 Get help with mod_perl =over =item * mod_perl home http://perl.apache.org =item * News and Resources Take23: News and Resources for the mod_perl world http://take23.org =item * mod_perl Books =over =item * Writing Apache Modules with Perl and C http://www.modperl.com is the home site of The Apache Modules Book, a book about creating Web server modules using the Apache API, written by Lincoln Stein and Doug MacEachern. The book should be available from your local bookstore or from your favorite on-line bookseller. O'Reilly lists this book as: Writing Apache Modules with Perl and C By Lincoln Stein & Doug MacEachern 1st Edition March 1999 2nd Edition Feb 2000 1-56592-567-X, Order Number: 567X 746 pages, $34.95 =item * The mod_perl Developer's Cookbook Home site: http://www.modperlcookbook.org SAMS lists this book as: The mod_perl Developer's Cookbook By Geoffrey Young; Paul Lindner; Randy Kobes ISBN: 0672322404 Pub. Date: Jan 17,2002 Price: $39.99 Pages: 600 =item * Managing and Programming mod_perl http://www.modperlbook.com is the home site of the new mod_perl book that Eric Cholet and Stas Bekman are co-authoring. We expect the book to be published in 2002. Ideas, suggestions and comments are welcome. Please send them to [EMAIL PROTECTED] .
=item * mod_perl Quick Reference Card I<mod_perl Pocket Reference> by Andrew Ford was published by O'Reilly and Associates http://www.oreilly.com/catalog/modperlpr/ You should probably also get the I<Apache Pocket Reference> by the same author and the same publisher: http://www.oreilly.com/catalog/apachepr/ See also Andrew's collection of reference cards for Apache and other programs: http://www.refcards.com. =back =item * mod_perl Guide by Stas Bekman at http://perl.apache.org/guide =item * mod_perl FAQ by Frank Cringle at http://perl.apache.org/faq/ . =item * mod_perl performance tuning guide by Vivek Khera at http://perl.apache.org/tuning/ . =item * mod_perl plugin reference guide by Doug MacEachern at http://perl.apache.org/src/mod_perl.html . =item * Quick guide for moving from CGI to mod_perl at http://perl.apache.org/dist/cgi_to_mod_perl.html . =item * mod_perl_traps, common traps and solutions for mod_perl users at http://perl.apache.org/dist/mod_perl_traps.html . =item * Articles =over =item * PerlMonth http://perlmonth.com =item * ApacheToday http://apachetoday.com =item * Basic knowledge about Apache stages and mod_perl handlers, an article in German http://www.heise.de/ix/artikel/2000/01/156/ =back =item * mod_perl mailing lists =over =item * The mod_perl mailing list The Apache/Perl mailing list I<is available for mod_perl users and developers to share ideas, solve problems and discuss things related to mod_perl and the Apache::* modules.> To subscribe to this list, send email to [EMAIL PROTECTED] . To unsubscribe send email to [EMAIL PROTECTED] . To subscribe to the digest list send email to [EMAIL PROTECTED] . Searchable mailing list archives are available: =over =item * http://mathforum.org/epigone/modperl =item * http://www.geocrawler.com/lists/3/web/182/0/ =item * http://www.mail-archive.com/modperl%40apache.org/ =item * http://www.davin.ottawa.on.ca/archive/modperl/ =item * http://marc.theaimsgroup.com/?l=apache-modperl =item * http://www.egroups.com/group/modperl/ =back =item * The mod_perl development mailing list This list is for discussions about the development of the core mod_perl. Subscription information: To subscribe to this list send an empty email to [EMAIL PROTECTED] To unsubscribe from the list send an empty email to [EMAIL PROTECTED] To get help with the list send an empty email to [EMAIL PROTECTED] List's searchable archives: =over =item * http://marc.theaimsgroup.com/?l=apache-modperl-dev&r=1&w=2#apache-modperl-dev =item * http://www.mail-archive.com/dev%40perl.apache.org/ =back =item * The mod_perl documentation mailing list This mailing list is for discussing mod_perl documentation. Subscription information: Subscribe to the list: [EMAIL PROTECTED] Unsubscribe from the list: [EMAIL PROTECTED] Get help with the list: [EMAIL PROTECTED] =item * The Apache test framework development mailing list The test-dev list is the list where the Apache HTTP Test project is discussed. Subscription information: Subscribe to the list: [EMAIL PROTECTED] Unsubscribe from the list: [EMAIL PROTECTED] Get help with the list: [EMAIL PROTECTED] List's searchable archive: http://www.apachelabs.org/test-dev/ =item * The advocacy mailing list The list for mod_perl advocacy issues, discussions about sites, etc. Subscribe by sending a mail to [EMAIL PROTECTED] Unsubscribe by sending a mail to [EMAIL PROTECTED] Use advocacy@perl.apache.org to post to the list. The archive: http://www.mail-archive.com/advocacy@perl.apache.org/.
=item * The modperl-cvs mailing list The modperl-cvs list is the list where you can watch mod_perl getting patched. No real discussions happen on this list, but if you want to know about the latest changes in the mod_perl core before everyone else, this is the list to be on. To subscribe to this list, send email to [EMAIL PROTECTED] . To unsubscribe send email to [EMAIL PROTECTED] . Send email to modperl-cvs@apache.org to post to the list. No archives available. =back =back =head1 Get help with Perl =over =item * The Perl FAQ http://www.perl.com/CPAN/doc/FAQs/FAQ/PerlFAQ.html =item * The Perl Home Page http://www.perl.com/ =item * The Perl Journal http://www.tpj.com/ =item * Perl Module Mechanics http://world.std.com/~swmcd/steven/perl/module_mechanics.html - This page describes the mechanics of creating, compiling, releasing and maintaining Perl modules. =item * Creating (and Maintaining) Perl Modules http://www.mathforum.com/~ken/perl_modules.html =item * XS tutorials The Perl manpages I<perlguts>, I<perlxs>, and I<perlxstut>. Dean Roehrich's XS CookBookA and CookBookB http://search.cpan.org/search?dist=CookBookA http://search.cpan.org/search?dist=CookBookB A series of articles at PerlMonth.com by Steven McDougall: http://www.perlmonth.com/columns/modules/modules.html?issue=6 http://www.perlmonth.com/columns/modules/modules.html?issue=7 http://www.perlmonth.com/columns/modules/modules.html?issue=8 http://www.perlmonth.com/columns/modules/modules.html?issue=9 http://www.perlmonth.com/columns/modules/modules.html?issue=10 I<Advanced Perl Programming> By Sriram Srinivasan. Published by O'Reilly & Associates. ISBN: 1-56592-220-4. Chapters 18-20. The perl-xs mailing list on perl.org (mail [EMAIL PROTECTED]) Also take a look at Inline.pm (CPAN) and SWIG (http://www.swig.org/) =item * Perl mailing lists A huge number of diverse Perl mailing lists is listed at http://lists.perl.org/. =item * perl5-porters mailing list Send an email: =over =item * to [EMAIL PROTECTED] to subscribe to this list. =item * to [EMAIL PROTECTED] to unsubscribe from this list. =item * to [EMAIL PROTECTED] if you prefer a digest version. =item * to [EMAIL PROTECTED] to unsubscribe from the digest. =item * to [EMAIL PROTECTED] for more information about the subscription options. =back See also http://tile.net/listserv/perl5portersdigest.html and http://tile.net/lists/perl5porters.html . The list's archive is available at http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/.
=back =head1 Get help with Perl/CGI =over =item * Perl/CGI FAQ at http://www.perl.com/CPAN/doc/FAQs/cgi/perl-cgi-faq.html =item * Answers to some troublesome Perl and Perl/CGI questions http://stason.org/TULARC/webmaster/myfaq.html =item * Idiot's Guide to CGI programming http://www.perl.com/CPAN/doc/FAQs/cgi/idiots-guide.html =item * WWW Security FAQ http://www.w3.org/Security/Faq/www-security-faq.html =item * CGI/Perl Taint Mode FAQ http://www.gunther.web66.com/FAQS/taintmode.html (by Gunther Birznieks) =item * cgi-list mailing list Send email to [EMAIL PROTECTED] with body: subscribe cgi-list =back =head1 Get help with Apache =over =item * Apache Project's Home http://www.apache.org =item * Apache Mailing Lists You will find a comprehensive list of all Apache projects' mailing lists at: http://www.apache.org/foundation/mailinglists.html =item * Apache Quick Reference Card http://www.refcards.com (other reference cards are also available from this link) =item * The Apache FAQ http://www.apache.org/docs/misc/FAQ.html =item * Apache Server Documentation http://www.apache.org/docs/ =item * Apache Handlers http://www.apache.org/docs/handler.html =item * mod_rewrite Guide http://www.engelschall.com/pw/apache/rewriteguide/ =item * articles Security and Apache: An Essential Primer http://linuxplanet.com/linuxplanet/print/1527/ Using Apache with Suexec on Linux http://linuxplanet.com/linuxplanet/print/1445/ Installing and Securing the Apache Webserver with SSL http://www.securityfocus.com/focus/sun/articles/apache-inst.html?&_ref=1653102939 =item * mod_throttle_access mod_throttle_access is available at http://www.fremen.org/apache/ =back =head1 Get help with DBI and SQL =over =item * Introduction to Structured Query Language: http://w3.one.net/~jhoffman/sqltut.htm =item * SQL for Web Nerds by Philip Greenspun http://www.arsdigita.com/books/sql/ =item * Jeffrey Baker's DBI Examples and Performance Tuning http://www.saturn5.com/~jwb/dbi-examples.html (by Jeffrey William Baker). =item * DBI Homepage http://www.symbolstone.org/technology/perl/DBI/ =item * DBI mailing list information http://www.fugue.com/dbi/ =item * DBI mailing list archives http://outside.organic.com/mail-archives/dbi-users/ http://www.xray.mpe.mpg.de/mailing-lists/dbi/ =back =head1 Get help with Squid - Internet Object Cache =over 4 =item * Home page - http://squid.nlanr.net/ =item * FAQ - http://squid.nlanr.net/Squid/FAQ/FAQ.html =item * Users Guide - http://squid.nlanr.net/Squid/Users-Guide/ =item * Mailing lists - http://squid.nlanr.net/Squid/mailing-lists.html =back =head1 Get help with CVS -- Concurrent Versions System =over =item * mod_perl repository specific doc: mod_perl_cvs.pod, located in the root of the mod_perl source distribution and online at http://perl.apache.org/mod_perl_cvs.html =item * Open Source Development with CVS http://cvsbook.red-bean.com/ =item * Online Documents http://www.sourcegear.com/CVS/Docs/online =item * CVS Quick Reference http://www.sourcegear.com/CVS/Docs/ref =item * CVS Books http://www.sourcegear.com/CVS/Docs/books =item * User-Written FAQ http://www.sourcegear.com/CVS/Docs/docfaq =item * Introduction to CVS http://www.sourcegear.com/CVS/Docs/blandy =back =head1 Get help with Performance and Scalability =over =item * Techniques and Technologies for Scaling Internet Services mailing list.
The list's address: [EMAIL PROTECTED] Subscribe by sending a message to [EMAIL PROTECTED] Archive: http://archive.develooper.com/[EMAIL PROTECTED]/ =item * I<Solaris 2.x - Tuning Your TCP/IP Stack and More> http://www.sean.de/Solaris/tune.html This page talks about the TCP/IP stack and various tricks for tuning your system to get the most out of it as a web server. While the information is for the Solaris 2.x OS, most of it will be relevant to other Unix flavors. At the end an extensive list of related literature is presented. =item * High-Availability Linux Project http://www.linux-ha.org/ =back =head1 Get help with Unix OS flavors -- Unix OS related resources =over =item * Memory system management and architecture: The Solaris memory system, sizing, tools and architecture: http://www.sun.com/solutions/third-party/global/SAS/pdf/vmsizing.pdf =back =cut 1.1 modperl-docs/src/docs/1.0/guide/install.pod Index: install.pod =================================================================== =head1 NAME mod_perl Installation =head1 A Summary of a Basic mod_perl Installation The following 10 commands summarize the execution steps required to build and install a basic mod_perl enabled Apache server on almost any standard flavor of Unix OS. % cd /usr/src % lwp-download http://www.apache.org/dist/apache_x.x.x.tar.gz % lwp-download http://perl.apache.org/dist/mod_perl-x.xx.tar.gz % tar xzvf apache_x.x.x.tar.gz % tar xzvf mod_perl-x.xx.tar.gz % cd mod_perl-x.xx % perl Makefile.PL APACHE_SRC=../apache_x.x.x/src \ DO_HTTPD=1 USE_APACI=1 EVERYTHING=1 % make && make test && make install % cd ../apache_x.x.x % make install Of course you should replace I<x.xx> and I<x.x.x> with the real version numbers of mod_perl and Apache. All that's left is to add a few configuration lines to C<httpd.conf>, the Apache configuration file, start the server and enjoy mod_perl. If you have stumbled upon a problem at any of the above steps, don't despair; the next sections will explain each and every step in detail. Of course there is a way of installing mod_perl in only a couple of minutes if you are using a Linux distribution that uses RPM packages: % rpm -i apache-xx.xx.rpm % rpm -i mod_perl-xx.xx.rpm or on an apt system: % apt-get install libapache-mod-perl These should set up both Apache and mod_perl correctly for your system. Using a packaged distribution can make installing and reinstalling a lot quicker and easier. (Note that the filenames will vary, and I<xx.xx> will differ.) Since mod_perl can be configured in many different ways (features can be enabled or disabled, directories can be modified, etc.) it's preferable to use a manual installation, as a prepackaged version might not suit your needs. Manual installation will also allow you to fine-tune the server for the best performance. In this chapter we will talk extensively about the prepackaged versions, and ways to prepare your own packages for reuse on many machines. =head1 The Gory Details We saw that the basic mod_perl installation is quite simple and takes about 10 commands. You can copy and paste them from these pages. The parameter C<EVERYTHING=1> selects a lot of options, but sometimes you will need different ones. You may need to pass only specific parameters, to bundle other components with mod_perl etc. You may want to build mod_perl as a loadable object instead of compiling it into Apache, so that it can be upgraded without rebuilding Apache itself; a sketch of such a build follows below. To accomplish this you will need to understand various techniques for mod_perl configuration and building.
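As a preview of the DSO route described in the following sections, such a build might look roughly like the sketch below. This is only an illustration: it assumes mod_perl's C<USE_APXS> and C<WITH_APXS> parameters and an already-installed Apache whose I<apxs> utility lives at I</usr/local/apache/bin/apxs>; adjust the path for your installation.

  % cd mod_perl-x.xx
  % perl Makefile.PL USE_APXS=1 \
      WITH_APXS=/usr/local/apache/bin/apxs \
      EVERYTHING=1
  % make && make install

With this kind of build mod_perl is loaded from I<httpd.conf> with a C<LoadModule> directive instead of being compiled into the C<httpd> binary, so it can later be rebuilt and reinstalled without touching Apache itself.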
You need to know what configuration parameters are available to you and when and how to use them. As with Perl, with mod_perl simple things are simple. But when you need to accomplish more complicated tasks you may have to invest some time to gain a deeper understanding of the process. In this chapter I will take the following route. I'll start with a detailed explanation of the four stages of the mod_perl installation process, then continue with the different paths each installation might take according to your goal, followed by a few copy-and-paste real world installation scenarios. Towards the end of the chapter I will show you various approaches that make the installations easier, automating most of the steps. Finally I'll cover some of the general issues that can cause new users to stumble while installing mod_perl. We can clearly separate the installation process into the following stages: =over 4 =item * Source Configuration, =item * Building, =item * Testing and =item * Installation. =back =head2 Source Configuration (perl Makefile.PL ...) Before building and installing mod_perl you have to configure it. You configure mod_perl just like any other Perl module: % perl Makefile.PL [parameters] In this section we will go through most of the parameters mod_perl can accept and explain each one of them. First let's see what configuration mechanisms we have available. Basically they all define a special set of parameters to be passed to C<perl Makefile.PL>. Depending on the chosen configuration, the final product might be a stand-alone httpd binary or a loadable object. The source configuration mechanism in Apache 1.3 provides four major features (which of course are available to mod_perl): =over 4 =item Per-module configuration scripts (ConfigStart/End) This is a mechanism modules can use to link themselves into the configuration process. It is useful for automatically adjusting the configuration and build parameters from the module's sources. It is triggered by C<ConfigStart>/C<ConfigEnd> sections inside I<modulename.module> files (e.g. I<libperl.module>). =item Apache Autoconf-style Interface (APACI) This is the new top-level C<configure> script from Apache 1.3 which provides a GNU Autoconf-style interface. It is useful for configuring the source tree without manually editing any I<src/Configuration> files. Any parameterization can be done via command line options to the C<configure> script. Internally this is just a nifty wrapper to the old C<src/Configure> script. Since Apache 1.3 this is the way to install mod_perl as cleanly as possible. Currently this is a pure Unix-based solution because at present the complete Apache 1.3 source configuration mechanism is only available under Unix. It doesn't work on the Win32 platform for example. =item Dynamic Shared Object (DSO) support Besides Windows NT support this is one of the most interesting features in Apache 1.3. It's a way to build Apache modules as so-called I<dynamic shared objects> (usually named I<modulename.so>) which can be loaded via the C<LoadModule> directive in Apache's I<httpd.conf> file. The benefit is that a module becomes part of the C<httpd> executable only on demand, i.e. it is loaded into the address space of the C<httpd> executable only when the user wants it. This is interesting for instance in relation to memory consumption and upgrading. The DSO mechanism is provided by Apache's C<mod_so> module which needs to be compiled into the C<httpd> binary.
This is done automatically when DSO is enabled for module C<mod_foo> via: ./configure --enable-module=foo or by explicitly adding C<mod_so> via: ./configure --enable-module=so =item APache eXtenSion (APXS) support tool This is a new support tool from Apache 1.3 which can be used to build an Apache module as a DSO even B<outside> the Apache source-tree. One can say C<APXS> is for Apache what C<MakeMaker> and C<XS> are for Perl. It knows the platform dependent build parameters for making DSO files and provides an easy way to run the build commands with them. (C<MakeMaker> allows easy automatic configuration, building, testing and installation of Perl modules, and C<XS> allows Perl code to call functions implemented in C/C++.) =back Taking these four features together provides a way to integrate mod_perl into Apache in a very clean and smooth way. I<No patching> of the Apache source tree is needed in the standard situation and in the APXS situation not even the Apache source tree is needed. To benefit from the above features a new hybrid build environment was created for the Apache side of mod_perl. The Apache side consists of the mod_perl C source files which have to be compiled into the C<httpd> program. They are usually copied to the subdirectory I<src/modules/perl/> in the Apache source tree. To integrate this subtree into the Apache build process a lot of adjustments were done by mod_perl's I<Makefile.PL> in the past, and additionally the C<Makefile.PL> controlled the Apache build process. This approach is problematic in several ways. It is very restrictive and not very clean because it assumes that mod_perl is the only third-party module which has to be integrated into Apache. The new approach described below avoids these problems. It prepares only the I<src/modules/perl/> subtree inside the Apache source tree I<without> adjusting or editing anything else. This way, no conflicts can occur. Instead, mod_perl is activated later (when the Apache source tree is configured, via APACI calls) and then it configures itself. We will return to each of the above configuration mechanisms when describing different installation passes, once the overview of the four building steps is completed. =head3 Configuration parameters The command C<perl Makefile.PL>, which is also known as the I<"configuration stage">, accepts various parameters. In this section we will learn what they are and when they should be used. =head4 APACHE_SRC If you specify neither the C<DO_HTTPD> nor the C<NO_HTTPD> parameters you will be asked the following question during the configuration stage: Configure mod_perl with ../apache_x.x.x/src ? C<APACHE_SRC> should be used to define Apache's source tree directory. For example: APACHE_SRC=../apache_x.x.x/src Unless C<APACHE_SRC> is specified, I<Makefile.PL> makes an intelligent guess by looking at the directories at the same level as the mod_perl sources and suggests a directory with the highest version of Apache found there. Answering I<'y'> confirms either I<Makefile.PL>'s guess about the location of the tree, or the directory you have specified with C<APACHE_SRC>. If you use C<DO_HTTPD=1> or C<NO_HTTPD>, the first Apache source tree found or the one you have defined will be used for the rest of the build process. =head4 DO_HTTPD, NO_HTTPD, PREP_HTTPD Unless any of C<DO_HTTPD>, C<NO_HTTPD> or C<PREP_HTTPD> is used, you will be prompted by the following question: Shall I build httpd in ../apache_x.x.x/src for you?
Answering I<'y'> will make sure an httpd binary will be built in I<../apache_x.x.x/src> when you run C<make>. To avoid this prompt when the answer is I<Yes> specify the following argument: DO_HTTPD=1 Note that if you set C<DO_HTTPD=1>, but do not use C<APACHE_SRC=../apache_x.x.x/src> then the first Apache source tree found will be used to configure and build against. Therefore it's highly advised to always use an explicit C<APACHE_SRC> parameter, to avoid confusion. C<PREP_HTTPD=1> just means a default answer of 'C<n>' to the latter prompt, meaning: I<Do not build (make) httpd in the Apache source tree>. But it will still ask you about Apache's source location even if you have used the C<APACHE_SRC> parameter. Providing the C<APACHE_SRC> parameter will just eliminate the need for C<perl Makefile.PL> to make a guess. To avoid the two prompts: Configure mod_perl with ../apache_x.x.x/src ? Shall I build httpd in ../apache_x.x.x/src for you? and avoid building httpd, use: NO_HTTPD=1 If you choose not to build the binary you will have to do that manually. We will talk about it later. In any case you will need to run C<make install> in the mod_perl source tree, so that the Perl side of mod_perl is installed. Note that C<make test> won't work until you have built the server. =head4 Callback Hooks A callback hook (abbrev. I<callback>) is a reference to a subroutine. In Perl we create callbacks with the S<$callback = \&subroutine> syntax, where in this example, C<$callback> contains a reference to the subroutine called I<"subroutine">. Callbacks are used when we want some action (subroutine call) to occur when some event takes place. Since we don't know exactly when the event will take place we give the event handler a callback to the subroutine we want executed. The handler will call our subroutine at the right time. By default, most of the callback hooks except for C<PerlHandler>, C<PerlChildInitHandler>, C<PerlChildExitHandler>, C<PerlConnectionApi>, and C<PerlServerApi> are turned off. You may enable them by editing I<src/modules/perl/Makefile>, or when running C<perl Makefile.PL>. The possible parameters for the appropriate hooks are: PERL_POST_READ_REQUEST PERL_TRANS PERL_INIT PERL_RESTART (experimental) PERL_HEADER_PARSER PERL_AUTHEN PERL_AUTHZ PERL_ACCESS PERL_TYPE PERL_FIXUP PERL_LOG PERL_CLEANUP PERL_CHILD_INIT PERL_CHILD_EXIT PERL_DISPATCH PERL_STACKED_HANDLERS PERL_METHOD_HANDLERS PERL_SECTIONS PERL_SSI As with any parameters that are either defined or not, use C<PERL_hookname=1> to enable them (e.g. C<PERL_AUTHEN=1>). To enable all but the last four callback hooks, use: ALL_HOOKS=1 =head4 EVERYTHING To enable everything set: EVERYTHING=1 =head4 PERL_TRACE To enable L<debug tracing|debug/Debug_Tracing> set: C<PERL_TRACE=1> =head4 APACHE_HEADER_INSTALL By default, the Apache source header files are installed into the I<$Config{sitearchexp}/auto/Apache/include> directory. The reason for installing the header files is to make life simpler for module authors/users when building/installing a module that taps into some Apache C functions, e.g. C<Embperl>, C<Apache::Peek>, etc. If you don't wish to install these files use: APACHE_HEADER_INSTALL=0 =head4 PERL_STATIC_EXTS Normally, if an extension is statically linked with Perl it is listed in C<Config.pm>'s C<$Config{static_exts}>, in which case mod_perl will also statically link this extension with httpd. However, if an extension is statically linked with Perl after it is installed, it is not listed in C<Config.pm>.
You may either edit C<Config.pm> and add these extensions, or configure mod_perl like this: perl Makefile.PL "PERL_STATIC_EXTS=Something::Static Another::One" =head4 APACI_ARGS When you use the C<USE_APACI=1> parameter, you can tell C<Makefile.PL> to pass any arguments you want to the Apache C<./configure> utility, e.g.: % perl Makefile.PL USE_APACI=1 \ APACI_ARGS='--sbindir=/usr/local/httpd_perl/sbin, \ --sysconfdir=/usr/local/httpd_perl/etc, \ --localstatedir=/usr/local/httpd_perl/var, \ --runtimedir=/usr/local/httpd_perl/var/run, \ --logfiledir=/usr/local/httpd_perl/var/logs, \ --proxycachedir=/usr/local/httpd_perl/var/proxy' Notice that B<all> C<APACI_ARGS> (above) must be passed as one long line if you work with C<t?csh>!!! However it works correctly as shown above (breaking the long lines with 'C<\>') with C<(ba)?sh>. If you use C<t?csh> it does not work, since C<t?csh> passes the C<APACI_ARGS> arguments to C<./configure> leaving the newlines untouched, but stripping the backslashes. This causes all the arguments except the first to be ignored by the configuration process. =head4 APACHE_PREFIX As an alternative to: APACI_ARGS='--prefix=/usr/local/httpd_perl' from the previous section, you can use the C<APACHE_PREFIX> parameter. When C<USE_APACI> is enabled, this attribute will specify the I<--prefix> option just like the above setting does. In addition, when the C<APACHE_PREFIX> option is used, C<make install> will be executed in the Apache source directory, which makes these two sequences equivalent: % perl Makefile.PL APACHE_SRC=../apache_x.x.x/src \ DO_HTTPD=1 USE_APACI=1 EVERYTHING=1 \ APACI_ARGS='--prefix=/usr/local/httpd_perl' % make && make test && make install % cd ../apache_x.x.x % make install % perl Makefile.PL APACHE_SRC=../apache_x.x.x/src \ DO_HTTPD=1 USE_APACI=1 EVERYTHING=1 \ APACHE_PREFIX=/usr/local/httpd_perl % make && make test && make install Now you can pick your favorite installation method. =head3 Environment Variables There are a few environment variables that influence the build/test process. =head4 APACHE_USER and APACHE_GROUP You can use the environment variables C<APACHE_USER> and C<APACHE_GROUP> to override the default C<User> and C<Group> settings in the I<httpd.conf> used during the S<'make test'> stage. (Introduced in mod_perl v1.23.) =head3 Reusing Configuration Parameters When you have to upgrade the server, it's quite hard to remember what parameters were used in a mod_perl build. So it's better to save them in a file. For example if you create a file at I<~/.mod_perl_build_options>, with contents: APACHE_SRC=../apache_x.x.x/src DO_HTTPD=1 USE_APACI=1 \ EVERYTHING=1 You can build the server with the following command: % perl Makefile.PL `cat ~/.mod_perl_build_options` % make && make test && make install But mod_perl has a standard method to perform this trick. If a file named I<makepl_args.mod_perl> containing any of these options is found in the same directory as the mod_perl build location, it will be read in by I<Makefile.PL>. Parameters supplied at the command line will override the parameters given in this file. % ls -1 /usr/src apache_x.x.x/ makepl_args.mod_perl mod_perl-x.xx/ % cat makepl_args.mod_perl APACHE_SRC=../apache_x.x.x/src DO_HTTPD=1 USE_APACI=1 \ EVERYTHING=1 % cd mod_perl-x.xx % perl Makefile.PL % make && make test && make install Now the parameters from the I<makepl_args.mod_perl> file will be used, as if they were directly typed in. Notice that this file can be located in your home directory or in the I<../> directory relative to the mod_perl distribution directory.
=head3 Reusing Configuration Parameters

When you have to upgrade the server, it's quite hard to remember what
parameters were used in a mod_perl build, so it's better to save them
in a file. For example, if you create a file
I<~/.mod_perl_build_options> with the contents:

  APACHE_SRC=../apache_x.x.x/src DO_HTTPD=1 USE_APACI=1 \
  EVERYTHING=1

you can build the server with the following command:

  % perl Makefile.PL `cat ~/.mod_perl_build_options`
  % make && make test && make install

But mod_perl has a standard method to perform this trick. If a file
named I<makepl_args.mod_perl> containing any of these options is
found in the same directory as the mod_perl build location, it will
be read in by I<Makefile.PL>. Parameters supplied on the command line
will override the parameters given in this file.

  % ls -1 /usr/src
  apache_x.x.x/
  makepl_args.mod_perl
  mod_perl-x.xx/

  % cat makepl_args.mod_perl
  APACHE_SRC=../apache_x.x.x/src DO_HTTPD=1 USE_APACI=1 \
  EVERYTHING=1

  % cd mod_perl-x.xx
  % perl Makefile.PL
  % make && make test && make install

Now the parameters from the I<makepl_args.mod_perl> file will be
used, as if they had been typed in directly. Notice that this file
can be located either in your home directory or in the I<../>
directory relative to the mod_perl distribution directory. This file
can also start with a dot (I<.makepl_args.mod_perl>), so you can keep
it nicely hidden along with the rest of the dot files in your home
directory.

There is a sample I<makepl_args.mod_perl> in the I<eg/> directory of
the mod_perl distribution package, in which you might also find a few
options for enabling experimental features to play with!

If you are faced with a compiled Apache and no trace of the
parameters used to build it, you can usually still find them if the
sources were not C<make clean>'d. You will find the Apache specific
parameters in I<apache_x.x.x/config.status> and the mod_perl
parameters in I<mod_perl_x.xx/apaci/mod_perl.config>.

=head3 Discovering Whether Some Option Was Configured

mod_perl version 1.25 introduced C<Apache::MyConfig>, which provides
access to the various hooks and features set when mod_perl is built.
This circumvents the need to set up a live server just to find out if
a certain callback hook is available.

To see whether some feature was built in or not, check the
C<%Apache::MyConfig::Setup> hash. For example, suppose we installed
mod_perl with the following options:

  panic% perl Makefile.PL EVERYTHING=1

but on the next day we don't remember what callback hooks were
enabled, and we want to know whether the C<PERL_LOG> callback hook is
among them. One way to find this out is to run the following code:

  panic% perl -MApache::MyConfig \
    -e 'print $Apache::MyConfig::Setup{PERL_LOG}'
  1

which prints '1'--meaning that the C<PERL_LOG> callback hook was
enabled. (That's because C<EVERYTHING=1> enables them all.)

Another approach is to configure C<Apache::Status> and run
http://localhost/perl-status?hooks to check for enabled hooks.

You may also try to look at the symbols inside the httpd executable
with the help of C<nm(1)> or a similar utility. For example, to see
whether C<PERL_LOG=1> was enabled while building mod_perl, search for
a symbol with the same name but in lowercase:

  panic% nm httpd | grep perl_log
  08071724 T perl_logger

Indeed, we can see that in our example C<PERL_LOG=1> was enabled. But
this will only work if you have an unstripped httpd binary. By
default, C<make install> strips the binary before installing it. Use
the C<--without-execstrip> C<./Configure> option to prevent stripping
during the I<make install> phase.

Yet another approach that will work in most of the cases is simply to
try to use the feature in question. If it wasn't configured, Apache
will give an error message.

=head3 Using an Alternative Configuration File

By default mod_perl provides its own copy of the I<Configuration>
file to Apache's C<./configure> utility. If you wish to pass it your
own version, do this:

  % perl Makefile.PL CONFIG=Configuration.custom

where I<Configuration.custom> is the pathname of the file I<relative
to the Apache source tree you build against>.

=head3 perl Makefile.PL Troubleshooting

=head4 "A test compilation with your Makefile configuration failed..."

When you see this during the C<perl Makefile.PL> stage:

  ** A test compilation with your Makefile configuration
  ** failed. This is most likely because your C compiler
  ** is not ANSI. Apache requires an ANSI C Compiler, such
  ** as gcc. The above error message from your compiler
  ** will also provide a clue.
   Aborting!

you've got a problem with your compiler. It is possible that it's
improperly installed or not installed at all. Sometimes the reason is
that your Perl executable was built on a different machine, and the
software installed on your machine is not the same.
Generally this happens when you install prebuilt packages, like RPM
or deb: the dependencies weren't properly defined in the Perl binary
package, so you were allowed to install it even though some essential
package was missing. The most frequent pitfall is a missing gdbm
library. See L<Missing or Misconfigured
libgdbm.so|install/Missing_or_Misconfigured_libgdbm> for more info.

But why guess, when we can actually see the real error message and
understand what the real problem is? To get the real error message,
edit the Apache I<src/Configure> script. Down around line 2140 you
will see a line like this:

  if ./helpers/TestCompile sanity; then

Change it to:

  if ./helpers/TestCompile -v sanity; then

and try again. Now you should get a useful error message.

=head4 Missing or Misconfigured libgdbm.so

On some Linux RedHat systems you might encounter a problem during the
C<perl Makefile.PL> stage, when the Perl installed from the RPM
package was built with the C<gdbm> library, but the library isn't
actually installed. If this is your situation, make sure you install
the library before proceeding with the build process.

You can check how Perl was built by running the C<perl -V> command:

  % perl -V | grep libs

On my machine I get:

  libs=-lnsl -lndbm -lgdbm -ldb -ldl -lm -lc -lposix -lcrypt

Sometimes the problem is even more obscure: you do have C<libgdbm>
installed, but it's not properly installed. Do this:

  % ls /usr/lib/libgdbm.so*

If you get at least three lines like I do:

  lrwxrwxrwx /usr/lib/libgdbm.so -> libgdbm.so.2.0.0
  lrwxrwxrwx /usr/lib/libgdbm.so.2 -> libgdbm.so.2.0.0
  -rw-r--r-- /usr/lib/libgdbm.so.2.0.0

you are all set. On some installations the I<libgdbm.so> symbolic
link is missing, so you get only:

  lrwxrwxrwx /usr/lib/libgdbm.so.2 -> libgdbm.so.2.0.0
  -rw-r--r-- /usr/lib/libgdbm.so.2.0.0

To fix this problem, add the missing symbolic link:

  % cd /usr/lib
  % ln -s libgdbm.so.2.0.0 libgdbm.so

Now you should be able to build mod_perl without any problems. Note
that you might need to prepare this symbolic link as well:

  lrwxrwxrwx /usr/lib/libgdbm.so.2 -> libgdbm.so.2.0.0

with:

  % ln -s libgdbm.so.2.0.0 libgdbm.so.2

Of course, if a new version of the C<libgdbm> library has been
released by the time you read this, you will have to adjust the
version numbers. We didn't use the usual I<x.x.x> version placeholder
here, to make it easier to understand how the symbolic links should
be set.

=head4 About gdbm, db and ndbm libraries

Both the gdbm and db libraries offer ndbm emulation, which is the
interface that Apache actually uses, so when you build mod_perl you
end up with whichever library was linked first by the Perl compile.
If you build Apache without mod_perl, you end up with whatever
appears to be your ndbm library, which will vary between systems, and
especially Linux distributions. You may have to work a bit to get
both Apache and Perl to use the same library, and you are likely to
have trouble copying the dbm file from one system to another or even
using it after an upgrade.
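A quick way to see which dbm library each side actually linked is to
inspect the binaries. This is only a diagnostic sketch, assuming a
dynamically linked I<httpd> and standard GNU tools; the exact output
will differ on your system:

  # which dbm libraries Perl was linked against
  % perl -V:libs

  # which shared dbm libraries httpd pulls in, if any
  % ldd httpd | egrep -i 'dbm|libdb'

If both report the same library (e.g. I<libgdbm>), Apache and Perl
agree on the dbm format and the files should be interchangeable.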
=head4 Undefined reference to `PL_perl_destruct_level'

When manually building mod_perl using the shared library:

  cd mod_perl-x.xx
  perl Makefile.PL PREP_HTTPD=1
  make
  make test
  make install

  cd ../apache_x.x.x
  ./configure --with-layout=RedHat --target=perlhttpd \
    --activate-module=src/modules/perl/libperl.a

you might get:

  gcc -c -I./os/unix -I./include -DLINUX=2 -DTARGET=\"perlhttpd\"
    -DUSE_HSREGEX -DUSE_EXPAT -I./lib/expat-lite `./apaci` buildmark.c
  gcc -DLINUX=2 -DTARGET=\"perlhttpd\" -DUSE_HSREGEX -DUSE_EXPAT
    -I./lib/expat-lite `./apaci` \
    -o perlhttpd buildmark.o modules.o modules/perl/libperl.a
    modules/standard/libstandard.a main/libmain.a ./os/unix/libos.a
    ap/libap.a regex/libregex.a lib/expat-lite/libexpat.a -lm -lcrypt
  modules/perl/libperl.a(mod_perl.o): In function `perl_shutdown':
  mod_perl.o(.text+0xf8): undefined reference to `PL_perl_destruct_level'
  mod_perl.o(.text+0x102): undefined reference to `PL_perl_destruct_level'
  mod_perl.o(.text+0x10c): undefined reference to `PL_perl_destruct_level'
  mod_perl.o(.text+0x13b): undefined reference to `Perl_av_undef'
  [more errors snipped]

This happens when your Perl was built statically linked, i.e. without
a shared I<libperl>. Build a dynamically linked Perl (with a shared
I<libperl.so>) and the problem will disappear.

=head2 mod_perl Building (make)

After completing the configuration, you build the server by calling:

  % make

which compiles the source files and creates an httpd binary and/or a
separate library for each module. The modules can either be inserted
into the httpd binary when C<make> is called from the Apache source
directory, or loaded later, at run time.

Note: don't put the mod_perl directory inside the Apache directory.
This confuses the build process.

=head3 make Troubleshooting

=head4 Undefined reference to 'Perl_newAV'

This and similar error messages may show up during the C<make>
process. Generally it happens when you have a broken Perl
installation. Make sure it's not installed from a broken RPM or
another binary package. If it is, build Perl from source or use
another properly built binary package. Run C<perl -V> to learn what
version of Perl you are using and other important details.

=head4 Unrecognized format specifier for...

This error is usually reported due to problems with some versions of
the SFIO library. Try to use the latest version to get around this
problem, or, if you don't really need SFIO, rebuild Perl without this
library.

=head2 Built Server Testing (make test)

After building the server, it's a good idea to test it thoroughly, by
calling:

  % make test

Fortunately mod_perl comes with a bunch of tests, which attempt to
use all the features you asked for at the configuration stage. If any
of the tests fail, the C<make test> stage will fail.

Running C<make test> will start a freshly built httpd on port 8529,
running under the uid and gid of the C<perl Makefile.PL> process. The
httpd will be terminated when the tests are finished.

Each file in the testing suite generally includes more than one test,
but when you do the testing, the program will only report how many
tests were passed and the total number of tests defined in the test
file. However, if only some of the tests in a file fail, you want to
know which ones failed. To gain this information you should run the
tests in verbose mode.
You can enable this mode by using the C<TEST_VERBOSE> parameter:

  % make test TEST_VERBOSE=1

To change the default port (8529) used for the tests, do this:

  % perl Makefile.PL PORT=xxxx

To start the newly built Apache:

  % make start_httpd

To shutdown Apache:

  % make kill_httpd

NOTE to Ben-SSL users: httpsd does not seem to handle I</dev/null> as
the location of certain files (for example, some of the configuration
files mentioned in I<httpd.conf> can be ignored by reading them from
I</dev/null>), so you'll have to change these by hand. The tests are
run with the C<SSLDisable> directive.

=head3 Manual Testing

Tests are invoked by running the C<./TEST> script located in the
I<./t> directory. Use the I<-v> option for verbose tests. You might
run an individual test like this:

  % t/TEST -v modules/file.t

or all tests in a test sub-directory:

  % t/TEST modules

The C<TEST> script starts the server before the tests are executed.
If for some reason it fails to start, use C<make start_httpd> to
start it manually.

=head3 make test Troubleshooting

=head4 make test fails

You cannot run C<make test> before you build Apache, so if you told
C<perl Makefile.PL> not to build the httpd executable, there is no
httpd to run the tests against. Go to the Apache source tree and run
C<make>, then return to the mod_perl source tree and continue with
the server testing.

=head4 mod_perl.c is incompatible with this version of Apache

If you had a stale old Apache header layout in one of the I<include>
paths during the build process, you will see this message when you
try to execute httpd. Run the C<find> (or C<locate>) utility in order
to locate the file I<ap_mmn.h>. Delete it and rebuild Apache. RedHat
installed a copy of I</usr/local/include/ap_mmn.h> on my system.

For all RedHat fans, before you build Apache yourself, do:

  % rpm -e apache

to remove the pre-installed RPM package first! Users of apt-based
systems would instead do:

  % apt-get remove apache

=head4 make test......skipping test on this platform

While doing C<make test> you will notice that some of the tests are
reported as I<skipped>. The reason is that you are missing some
optional modules needed for these tests to pass. For a hint, you
might want to peek at the content of each test; you will find them
all in the I<./t> directory (mnemonic: t for tests). I'll list a few
examples, but of course things may change in the future.

  modules/cookie......skipping test on this platform
  modules/request.....skipping test on this platform

Install the libapreq package, which includes among others the
C<Apache::Request> and C<Apache::Cookie> modules.

  modules/psections...skipping test on this platform

Install the C<Devel::Symdump> and C<Data::Dumper> modules.

  modules/sandwich....skipping test on this platform

Install C<Apache::Sandwich>.

  modules/stage.......skipping test on this platform

Install C<Apache::Stage>.

  modules/symbol......skipping test on this platform

Install C<Devel::Symdump>.

Chances are that all of these are already installed if you use
C<CPAN.pm> to C<install Bundle::Apache>.

=head4 make test Fails Due to Misconfigured localhost Entry

The C<make test> suite uses I<localhost> to run the tests that
require a network. Make sure you have this entry in I</etc/hosts>:

  127.0.0.1    localhost.localdomain    localhost

Also make sure that you have the loopback device [lo] configured.
[Hint: try 'ifconfig lo' to test for its existence.]

=head2 Installation (make install)

After testing the server, the last step left is to install it.
First install all the Perl side files:

  % make install

Then go to the Apache source tree and complete the Apache
installation (installing the configuration files, httpd and
utilities):

  % cd ../apache_x.x.x
  % make install

Now the installation should be considered complete. You may now
configure your server and start using it.

=head2 Building Apache and mod_perl by Hand

If you wish to build httpd separately from mod_perl, you should use
the C<NO_HTTPD=1> option during the C<perl Makefile.PL> (mod_perl
build) stage. Then you will need to configure various things by hand
and proceed to build Apache. You shouldn't run C<perl Makefile.PL>
before following the steps described in this section.

If you choose to manually build mod_perl, there are three things you
may need to set up before the build stage:

=over 4

=item mod_perl's Makefile

When C<perl Makefile.PL> is executed,
I<$APACHE_SRC/modules/perl/Makefile> may need to be modified to
enable various options (e.g. C<ALL_HOOKS=1>).

Optionally, instead of tweaking the options during C<perl
Makefile.PL>, you may edit I<mod_perl-x.xx/src/modules/perl/Makefile>
before running C<perl Makefile.PL>.

=item Configuration

Add the following to I<apache_x.x.x/src/Configuration>:

  AddModule modules/perl/libperl.a

We suggest you add this entry at the end of the I<Configuration> file
if you want your L<callback hooks|install/Callback_Hooks> to have
precedence over core handlers.

Add the following to C<EXTRA_LIBS>:

  EXTRA_LIBS=`perl -MExtUtils::Embed -e ldopts`

Add the following to C<EXTRA_CFLAGS>:

  EXTRA_CFLAGS=`perl -MExtUtils::Embed -e ccopts`

(See the sample C<ExtUtils::Embed> output right after this section.)

=item mod_perl Source Files

Return to the mod_perl directory and copy the mod_perl source files
into the Apache build directory:

  % cp -r src/modules/perl apache_x.x.x/src/modules/

=back

When you are done with the configuration parts, run:

  % perl Makefile.PL NO_HTTPD=1 DYNAMIC=1 EVERYTHING=1 \
    APACHE_SRC=../apache_x.x.x/src

C<DYNAMIC=1> enables a build of the shared mod_perl library. Add
other options if required.

  % make install

Now you may proceed with the plain Apache build process. Note that in
order for your changes to the I<apache_x.x.x/src/Configuration> file
to take effect, you must run I<apache_x.x.x/src/Configure> instead of
the default I<apache_x.x.x/configure> script:

  % cd ../apache_x.x.x/src
  % ./Configure
  % make
  % make install
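For reference, here is roughly what the two C<ExtUtils::Embed>
one-liners used in the I<Configuration> step emit on one Linux
machine. The output is purely illustrative--the include paths and the
list of libraries depend entirely on how your Perl was built:

  % perl -MExtUtils::Embed -e ccopts
  -Dbool=char -DHAS_BOOL -I/usr/lib/perl5/5.00503/i386-linux/CORE

  % perl -MExtUtils::Embed -e ldopts
  -rdynamic -L/usr/lib/perl5/5.00503/i386-linux/CORE -lperl -lnsl
  -ldl -lm -lc -lcrypt

Whatever these commands print on your system is exactly what the
Apache build needs in order to compile and link against your Perl.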
=head1 Installation Scenarios for Standalone mod_perl

There are various ways to build Apache with the new hybrid build
environment (using C<USE_APACI=1>):

=head2 The All-In-One Way

If your goal is just to build and install Apache with mod_perl out of
their source trees, and you have no special interest in further
adjusting or enhancing Apache, proceed as before:

  % tar xzvf apache_x.x.x.tar.gz
  % tar xzvf mod_perl-x.xx.tar.gz
  % cd mod_perl-x.xx
  % perl Makefile.PL APACHE_SRC=../apache_x.x.x/src \
    DO_HTTPD=1 USE_APACI=1 EVERYTHING=1
  % make && make test && make install
  % cd ../apache_x.x.x
  % make install

This builds Apache statically with mod_perl, installs Apache under
the default C</usr/local/apache> tree and mod_perl into the
C<site_perl> hierarchy of your existing Perl installation. All in one
step.

=head2 The Flexible Way

This is the normal situation, where you want to be flexible while
building: statically building mod_perl into the Apache binary
(C<httpd>), but via separate steps, so you have a chance to include
other third-party Apache modules, etc.

=over 4

=item 1 Prepare the Apache source tree

The first step is as before, extract the distributions:

  % tar xvzf apache_x.x.x.tar.gz
  % tar xzvf mod_perl-x.xx.tar.gz

=item 2 Install mod_perl's Perl-side and prepare the Apache-side

The second step is to install the Perl-side of mod_perl into the Perl
hierarchy and prepare the C<src/modules/perl/> subdirectory inside
the Apache source tree:

  $ cd mod_perl-x.xx
  $ perl Makefile.PL \
      APACHE_SRC=../apache_x.x.x/src \
      NO_HTTPD=1 \
      USE_APACI=1 \
      PREP_HTTPD=1 \
      EVERYTHING=1 \
      [...]
  $ make
  $ make install
  $ cd ..

The C<APACHE_SRC> option sets the path to your Apache source tree,
the C<NO_HTTPD> option forces this path and only this path to be
used, the C<USE_APACI> option triggers the new hybrid build
environment, and the C<PREP_HTTPD> option forces preparation of the
C<APACHE_SRC/modules/perl/> tree but no automatic build.

This configuration process prepares the Apache-side of mod_perl in
the Apache source tree, but doesn't touch anything else in it. It
then just builds the Perl-side of mod_perl and installs it into the
Perl installation hierarchy.

B<Important:> If you use C<PREP_HTTPD> as described above, to
complete the build you must go into the Apache source directory and
run C<make> and C<make install>.

=item 3 Additionally prepare other third-party modules

Now you have a chance to prepare other third-party modules. For
instance, the PHP language can be added in a manner similar to the
mod_perl procedure.

=item 4 Build the Apache package

Finally it's time to build the Apache package, and thus also the
Apache-side of mod_perl and any other third-party modules you've
prepared:

  $ cd apache_x.x.x
  $ ./configure \
      --prefix=/path/to/install/of/apache \
      --activate-module=src/modules/perl/libperl.a \
      [...]
  $ make
  $ make install

The C<--prefix> option is needed if you want to change the default
target directory of the Apache installation, and the
C<--activate-module> option activates mod_perl for the configuration
process and thus also for the build process. If you choose
C<--prefix=/usr/share/apache>, the Apache directory tree will be
installed in I</usr/share/apache>.

These last steps build and install the Apache-side of the mod_perl
enabled server. Presumably your new server includes third-party
components, otherwise you probably wouldn't choose this method of
building.

=back

The method described above enables you to insert mod_perl into Apache
without having to mangle the Apache source tree for mod_perl. It also
gives you the freedom to add third-party modules.

=head2 When DSO can be Used

If you want to build C<mod_perl> as a DSO, you must make sure that
Perl was built with the system's native malloc(). If Perl was built
with its own malloc() and C<-Dbincompat5005>, it pollutes the main
C<httpd> program with I<free> and I<malloc> symbols. When C<httpd>
restarts (this happens at startup too), any references in the main
program to I<free> and I<malloc> become invalid, and this causes
memory leaks and segfaults. Notice that mod_perl's build system warns
about this problem.

With Perl 5.6.0+ this pollution can be prevented with
C<-Ubincompat5005>, or with C<-Uusemymalloc> for any version of Perl.
However, C<-Uusemymalloc> might hurt performance depending on the
platform, so C<-Ubincompat5005> is likely a better choice.

If you get the following reports with Perl version 5.6.0+:

  % perl -V:usemymalloc
  usemymalloc='y';
  % perl -V:bincompat5005
  bincompat5005='define';

rebuild Perl with C<-Ubincompat5005>.
For Perl versions prior to 5.6.x, if you get:

  % perl -V:usemymalloc
  usemymalloc='y';

rebuild Perl with C<-Uusemymalloc>. Now rebuild mod_perl.

=head2 Build mod_perl as a DSO inside the Apache Source Tree via APACI

We have already said that the new mod_perl build environment
(C<USE_APACI>) is a hybrid. What does that mean? It means, for
instance, that the same C<src/modules/perl/> stuff can be used to
build mod_perl as a DSO or not, without having to edit anything
specifically for this. When you want to build C<libperl.so>, all you
have to do is to add one single option to the above steps.

=head3 libperl.so and libperl.a

C<libmodperl.so> would be a more correct name for the mod_perl file,
but the name has to be C<libperl.so> because of prehistoric Apache
issues. Don't confuse the C<libperl.so> for mod_perl with the file of
the same name which comes with Perl itself. They are two different
things. It is unfortunate that they happen to have the same name.
There is also a C<libperl.a> which comes with the Perl installation.
That's different too.

You have two options here, depending on which way you have chosen
above. If you chose the All-In-One way, then add:

  USE_DSO=1

to the C<perl Makefile.PL> options. If you chose the Flexible way,
then add:

  --enable-shared=perl

to Apache's C<./configure> options.

As you can see, only an additional C<USE_DSO=1> or
C<--enable-shared=perl> option is needed. Everything else is done
automatically: C<mod_so> is automatically enabled, the Makefiles are
adjusted automatically, and even the C<install> target from APACI now
additionally installs C<libperl.so> into the Apache installation
tree. And even more: the C<LoadModule> and C<AddModule> directives
(which dynamically load and insert mod_perl into httpd) are
automatically added to I<httpd.conf>.

=head2 Build mod_perl as a DSO outside the Apache Source Tree via APXS

Above we've seen how to build mod_perl as a DSO I<inside> the Apache
source tree. But there is a nifty alternative: building mod_perl as a
DSO I<outside> the Apache source tree via the new Apache 1.3 support
tool C<apxs> (APache eXtenSion). The advantage is obvious: you can
extend an already installed Apache with mod_perl even if you don't
have the sources (for instance, you may have installed an Apache
binary package from your vendor).

Here are the build steps:

  % tar xzvf mod_perl-x.xx.tar.gz
  % cd mod_perl-x.xx
  % perl Makefile.PL \
      USE_APXS=1 \
      WITH_APXS=/path/to/bin/apxs \
      EVERYTHING=1 \
      [...]
  % make && make test && make install

This will build the DSO C<libperl.so> I<outside> the Apache source
tree with the new Apache 1.3 support tool C<apxs> and install it into
the existing Apache hierarchy.

=head1 Installation Scenarios for mod_perl and Other Components

([ReaderMETA]: Please send more scenarios of mod_perl + other
components installation guidelines. Thanks!)

You have now seen very detailed installation instructions for
specific cases, but since mod_perl is used with many other components
that plug into Apache, you will definitely want to know how to build
them together with mod_perl.

Since all the steps are simple, and assuming that you now understand
how the build process works, I'll show only the commands to be
executed, with no comments unless there is something we haven't
discussed before.
Generally every example that I'm going to show consists of:

=over 4

=item 1

downloading the source distributions of the components to be used

=item 2

un-packing them

=item 3

configuring them

=item 4

building Apache using the parameters appropriate to each component

=item 5

C<make test> and C<make install>

=back

All these scenarios were tested on a Linux platform; you might need
to refer to the specific component's documentation if something
doesn't work for you as described below. The intention of this
section is not to show you how to install other non-mod_perl
components on their own, but how to do this in a bundle with
mod_perl.

Also, notice that the links I've used below are very likely to have
changed by the time you read this document. That's why I have used
the I<x.x.x> convention, instead of using hardcoded version numbers.
Remember to replace the I<x.xx> place-holders with the version
numbers of the distributions you are about to use. To find out the
latest stable version number, visit the components' sites. So if I
say http://perl.apache.org/dist/mod_perl-x.xx.tar.gz , go to
http://perl.apache.org/dist/ in order to learn the version number of
the latest stable release, and download the appropriate file.

Unless otherwise noted, all the components install themselves into a
default location. When you run C<make install>, the installation
program tells you where it's going to install the files.

=head2 mod_perl and mod_ssl (+openssl)

mod_ssl provides strong cryptography for the Apache 1.3 webserver via
the Secure Sockets Layer (SSL v2/v3) and Transport Layer Security
(TLS v1) protocols, with the help of the Open Source SSL/TLS toolkit
OpenSSL, which is based on SSLeay from Eric A. Young and Tim J.
Hudson.

Download the sources:

  % lwp-download http://www.apache.org/dist/apache_x.xx.tar.gz
  % lwp-download http://perl.apache.org/dist/mod_perl-x.xx.tar.gz
  % lwp-download http://www.modssl.org/source/mod_ssl-x.x.x-x.x.x.tar.gz
  % lwp-download http://www.openssl.org/source/openssl-x.x.x.tar.gz

Un-pack:

  % tar xvzf mod_perl-x.xx
  % tar xvzf apache_x.x.x.tar.gz
  % tar xvzf mod_ssl-x.x.x-x.x.x.tar.gz
  % tar xvzf openssl-x.x.x.tar.gz

Configure, build and install openssl:

  % cd openssl-x.x.x
  % ./config
  % make && make test && make install

Configure:

  % cd mod_ssl-x.x.x-x.x.x
  % ./configure --with-apache=../apache_x.x.x
  % cd ../mod_perl-x.xx
  % perl Makefile.PL USE_APACI=1 EVERYTHING=1 \
    DO_HTTPD=1 SSL_BASE=/usr/local/ssl \
    APACHE_PREFIX=/usr/local/apachessl \
    APACHE_SRC=../apache_x.x.x/src \
    APACI_ARGS='--enable-module=ssl,--enable-module=rewrite'

Note: Do not forget that if you use C<csh> or C<tcsh> you may need to
put all the arguments to C<perl Makefile.PL> on a single command
line.

Build, test and install:

  % make && make test && make install
  % cd ../apache_x.x.x
  % make certificate
  % make install

Now proceed with the mod_ssl and mod_perl parts of the server
configuration before starting the server. When the server starts you
should see the following or a similar entry in the I<error_log> file:

  [Fri Nov 12 16:14:11 1999] [notice]
  Apache/1.3.9 (Unix) mod_perl/1.21_01-dev mod_ssl/2.4.8
  OpenSSL/0.9.4 configured -- resuming normal operations
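Besides checking the I<error_log>, you can also ask the new binary
directly which modules were compiled in. This is just a quick sanity
check; the module list below is abridged and will vary with your
configuration:

  % /usr/local/apachessl/bin/httpd -l
  Compiled-in modules:
    http_core.c
    ...
    mod_ssl.c
    mod_perl.c

Both I<mod_ssl.c> and I<mod_perl.c> should appear in the list, since
in this scenario both were built statically into C<httpd>.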
=head2 mod_perl and mod_ssl Rolled from RPMs

As in the previous section, this shows an installation of mod_perl
and mod_ssl, but this time using sources/binaries prepackaged in
RPMs. As always, replace I<xxx> with the proper version numbers, and
replace C<i386> with the identifier for your platform if it is
different.

=over

=item 1

  % get apache-mod_ssl-x.x.x.x-x.x.x.src.rpm

Source: http://www.modssl.org

=item 2

  % get openssl-x.x.x.i386.rpm

Source: http://www.openssl.org/

=item 3

  % lwp-download http://perl.apache.org/dist/mod_perl-x.xx.tar.gz

Source: http://perl.apache.org/dist

=item 4

  % lwp-download http://www.engelschall.com/sw/mm/mm-x.x.xx.tar.gz

Source: http://www.engelschall.com/sw/mm/

=item 5

  % rpm -ivh openssl-x.x.x.i386.rpm

=item 6

  % rpm -ivh apache-mod_ssl-x.x.x.x-x.x.x.src.rpm

=item 7

  % cd /usr/src/redhat/SPECS

=item 8

  % rpm -bp apache-mod_ssl.spec

=item 9

  % cd /usr/src/redhat/BUILD/apache-mod_ssl-x.x.x.x-x.x.x

=item 10

  % tar xvzf mod_perl-x.xx.tar.gz

=item 11

  % cd mod_perl-x.xx

=item 12

  % perl Makefile.PL APACHE_SRC=../apache_x.x.x/src \
      DO_HTTPD=1 \
      USE_APACI=1 \
      PREP_HTTPD=1 \
      EVERYTHING=1

Add or remove parameters as appropriate.

=item 13

  % make

=item 14

  % make install

=item 15

  % cd ../mm-x.x.xx/

=item 16

  % ./configure --disable-shared

=item 17

  % make

=item 18

  % cd ../mod_ssl-x.x.x-x.x.x

=item 19

  % ./configure \
      --with-perl=/usr/bin/perl \
      --with-apache=../apache_x.x.x \
      --with-ssl=SYSTEM \
      --with-mm=../mm-x.x.xx \
      --with-layout=RedHat \
      --disable-rule=WANTHSREGEX \
      --enable-module=all \
      --enable-module=define \
      --activate-module=src/modules/perl/libperl.a \
      --enable-shared=max \
      --disable-shared=perl

=item 20

  % make

=item 21

  % make certificate

with whatever option is suitable to your configuration.

=item 22

  % make install

=back

You should be all set.

Note: If you use the standard config for mod_ssl, don't forget to
start Apache like this:

  % httpd -DSSL

=head2 mod_perl and apache-ssl (+openssl)

Apache-SSL is a secure Webserver based on Apache and SSLeay/OpenSSL.
It is licensed under a BSD-style license, which means, in short, that
you are free to use it for commercial or non-commercial purposes as
long as you retain the copyright notices.

Download the sources:

  % lwp-download http://www.apache.org/dist/apache_x.x.x.tar.gz
  % lwp-download http://perl.apache.org/dist/mod_perl-x.xx.tar.gz
  % lwp-download http://www.apache-ssl.org/.../apache_x.x.x+ssl_x.xx.tar.gz
  % lwp-download http://www.openssl.org/source/openssl-x.x.x.tar.gz

Un-pack:

  % tar xvzf mod_perl-x.xx
  % tar xvzf apache_x.x.x.tar.gz
  % tar xvzf openssl-x.x.x.tar.gz

Configure and install openssl:

  % cd openssl-x.x.x
  % ./config
  % make && make test && make install

Patch Apache with the SSLeay paths:

  % cd apache_x.xx
  % tar xzvf ../apache_x.x.x+ssl_x.xx.tar.gz
  % FixPatch
  Do you want me to apply the fixed-up Apache-SSL patch for you? [n] y

Now edit the I<src/Configuration> file if needed and then configure:

  % cd ../mod_perl-x.xx
  % perl Makefile.PL USE_APACI=1 EVERYTHING=1 \
    DO_HTTPD=1 SSL_BASE=/usr/local/ssl \
    APACHE_SRC=../apache_x.x.x/src

Build, test and install:

  % make && make test && make install
  % cd ../apache_x.x.x
  % make certificate
  % make install

Note that you might need to modify the 'make test' stage, as it takes
much longer for this server to get started, and C<make test> waits
only a few seconds for Apache to start before it times out.

Now proceed with the configuration of the apache_ssl and mod_perl
parts of the server configuration files before starting the server.

=head2 mod_perl and Stronghold

Stronghold is a secure SSL Web server for Unix which allows you to
give your web site full-strength, 128-bit encryption. You must first
build and install Stronghold without mod_perl, following Stronghold's
install procedure. For more information visit
http://www.c2.net/products/sh2/ .
Having done that, download the sources:

  % lwp-download http://perl.apache.org/dist/mod_perl-x.xx.tar.gz

Unpack:

  % tar xvzf mod_perl-x.xx.tar.gz

Configure (assuming that you have the Stronghold sources extracted to
I</usr/local/stronghold>):

  % cd mod_perl-x.xx
  % perl Makefile.PL APACHE_SRC=/usr/local/stronghold/src \
    DO_HTTPD=1 USE_APACI=1 EVERYTHING=1

Build:

  % make

Before running C<make test>, you must add your C<StrongholdKey> to
I<t/conf/httpd.conf>. If you are configuring by hand, be sure to edit
I<src/modules/perl/Makefile> and uncomment the C<#APACHE_SSL>
directive.

Test and install:

  % make test && make install
  % cd /usr/local/stronghold
  % make install

=head3 Note For Solaris 2.5 users

There has been a report related to the C<REGEX> library that comes
with Stronghold: after building Apache with mod_perl, it would
produce core dumps. To work around this problem, in
I<$STRONGHOLD/src/Configuration> change:

  Rule WANTHSREGEX=default

to:

  Rule WANTHSREGEX=no

=head2 mod_perl and mod_php

This is a simple installation scenario for building Apache with both
mod_perl and mod_php (a C<#> prompt indicates a command run as root):

=over

=item 1

Configure Apache:

  % cd apache_x.x.xx
  % ./configure --prefix=/usr/local/etc/httpd

=item 2

Build mod_perl:

  % cd ../mod_perl-x.xx
  % perl Makefile.PL APACHE_SRC=../apache_x.x.xx/src NO_HTTPD=1 \
    USE_APACI=1 PREP_HTTPD=1 EVERYTHING=1
  % make

=item 3

Build mod_php:

  % cd ../php-x.x.xx
  % ./configure --with-apache=../apache_x.x.xx \
    --with-mysql --enable-track-vars
  % make
  % make install

=item 4

Build Apache:

  % cd ../apache_x.x.xx
  % ./configure --prefix=/usr/local/etc/httpd \
    --activate-module=src/modules/perl/libperl.a \
    --activate-module=src/modules/php4/libphp4.a \
    --enable-module=stats \
    --enable-module=rewrite
  % make

=item 5

Test and install mod_perl:

  % cd ../mod_perl-x.xx
  % make test
  # make install

=item 6

Complete the Apache installation:

  % cd ../apache_x.x.xx
  # make install

=back

Note: If you need to build mod_ssl as well, make sure that you add
the mod_ssl component first.

=head1 mod_perl Installation with the CPAN.pm Interactive Shell

Installation of mod_perl and all the required packages is much easier
with the help of the C<CPAN.pm> module, which provides, among other
features, a shell interface to the CPAN repository. CPAN is the
Comprehensive Perl Archive Network, a repository of thousands of Perl
modules and scripts, as well as a vast amount of documentation. See
http://cpan.org for more information.

First things first: download the Apache source code and unpack it
into a directory -- you will need its name very soon.

Now execute:

  % perl -MCPAN -eshell

If this is the first time that you have used it, C<CPAN.pm> will ask
you about a dozen questions in order to configure the module. It's
quite easy to accomplish this task, and very helpful hints come along
with the questions. When you are finished you will see the C<CPAN>
prompt:

  cpan>

It can be a good idea to install a special C<CPAN> bundle of modules
to make using the CPAN module easier. Installation is as simple as
typing:

  cpan> install Bundle::CPAN

The C<CPAN> shell can download mod_perl for you, unpack it, check for
prerequisites, detect any missing third-party modules, and download
and install them.
All you need to do to install mod_perl is to type at the prompt:

  cpan> install mod_perl

You will see (I'll use C<x.xx> as a placeholder for real version
numbers, since these change very frequently):

  Running make for DOUGM/mod_perl-x.xx.tar.gz
  Fetching with LWP:
    http://www.cpan.org/authors/id/DOUGM/mod_perl-x.xx.tar.gz

  CPAN.pm: Going to build DOUGM/mod_perl-x.xx.tar.gz

  Enter `q' to stop search
  Please tell me where I can find your apache src
   [../apache_x.x.x/src]

C<CPAN.pm> will search for the latest Apache sources and suggest a
directory. Unless the CPAN shell found it and suggested the right
directory, you need to type the directory into which you unpacked
Apache. The next question is about the C<src> directory, which
resides at the root level of the unpacked Apache distribution. In
most cases the CPAN shell will suggest the correct directory.

  Please tell me where I can find your apache src
   [../apache_x.x.x/src]

Answer yes to all the following questions, unless you have a reason
not to do so.

  Configure mod_perl with /usr/src/apache_x.x.x/src ? [y]
  Shall I build httpd in /usr/src/apache_x.x.x/src for you? [y]

Now we will build Apache with mod_perl enabled. Quit the C<CPAN>
shell, or use another terminal. Go to the Apache sources root
directory and run:

  % make install

which will complete the installation by installing Apache's headers
and the binary in the appropriate directories.

The only caveat of the process I've just described is that you don't
have control over the configuration process. Actually, that problem
is easy to solve -- you can tell C<CPAN.pm> to pass whatever
parameters you want to C<perl Makefile.PL>. You do this with the
C<o conf makepl_arg> command:

  cpan> o conf makepl_arg 'DO_HTTPD=1 USE_APACI=1 EVERYTHING=1'

Just list all the parameters as if you were passing them to the
familiar C<perl Makefile.PL>. If you add the
C<APACHE_SRC=/usr/src/apache_x.x.x/src> and C<DO_HTTPD=1> parameters,
you will not be asked a single question. Of course you must give the
correct path to the Apache source distribution.

Now proceed with C<install mod_perl> as before. When the installation
is complete, remember to unset the C<makepl_arg> variable by
executing:

  cpan> o conf makepl_arg ''

If you have previously set C<makepl_arg> to some value, you will
probably want to save that value somewhere before you alter it for
the mod_perl installation, so that you can restore it when you have
finished. To see the original value, use:

  cpan> o conf makepl_arg

You can now install all the modules you might want to use with
mod_perl. You can install them all by typing a single command:

  cpan> install Bundle::Apache

This will install mod_perl if it isn't installed yet, and many other
packages such as: C<ExtUtils::Embed>, C<MIME::Base64>, C<URI::URL>,
C<Digest::MD5>, C<Net::FTP>, C<LWP>, C<HTML::TreeBuilder>, C<CGI>,
C<Devel::Symdump>, C<Apache::DB>, C<Tie::IxHash>, C<Data::Dumper>,
etc.

A helpful hint: if you have a system with all the Perl modules you
use, and you want to replicate them all elsewhere, but you cannot
just copy the whole C</usr/lib/perl5> directory because of a possible
binary incompatibility on the other system, making your own bundle is
a handy solution. To accomplish this, use the C<autobundle> command
on the CPAN shell command line. This command writes a bundle
definition file for all the modules that are installed for the
currently running perl interpreter.
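For example (a sketch -- the generated file name embeds the current
date, so yours will be different):

  cpan> autobundle
  ...
  Wrote bundle file
      /home/stas/.cpan/Bundle/Snapshot_2001_11_15_00.pm

You can then copy this file to the other machine (or rename it to
something more memorable) and install everything it lists from there.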
With this clever bundle file you can then simply say:

  cpan> install Bundle::my_bundle

and after answering a few questions, go out for a coffee.

=head1 Installing on Multiple Machines

You may wish to build httpd once, then copy it to other machines. The
Perl side of mod_perl needs the Apache header files to compile, so to
avoid having to drag the Apache sources along and build them on all
your other machines, there are a few Makefile targets to help you
out:

  % make tar_Apache

This will tar all the files mod_perl installs in your Perl's
I<site_perl> directory into a file called I<Apache.tar>. You can then
unpack this under the I<site_perl> directory on another machine.

  % make offsite-tar

This will copy all the header files from the Apache source directory
which you configured mod_perl against, then it will run C<make dist>,
which creates a I<mod_perl-x.xx.tar.gz>, ready to unpack on another
machine to compile and install the Perl side of mod_perl there.

If you really want to make your life easy, you should use one of the
more advanced packaging systems. For example, almost all Linux OS
distributions use packaging tools on top of plain tar.gz, allowing
you to track prerequisites for each package, and providing easy
installation, upgrade and cleanup. One of the most widely-used
packagers is RPM (Red Hat Package Manager). See http://www.rpm.org
for more information.

All you have to do is prepare a SRPM (source distribution package),
then build a binary release. This can be installed on any number of
machines in a matter of seconds. It will even work on live machines!

Suppose you have two identical machines (identical software and
hardware, although depending on your setup the hardware may be less
critical): one is a live server and the other is used for
development. You build an RPM with a mod_perl binary distribution,
install it on the development machine and satisfy yourself that it is
working and stable. You can then install the RPM package on the live
server without any fear. Make sure that I<httpd.conf> is correct,
since it generally includes parameters such as hostname which are
unique to the live machine.

When you have installed the package, just restart the server. It can
be a good idea to keep a package of the previous system, in case
something goes wrong. You can then easily remove the installed
package and put the old one back.

([ReaderMETA]: Dear reader, can you please share a step by step
scenario of the preparation of SRPMs for mod_perl? Thanks!!!)

=head1 Using RPM and Other Packages to Install mod_perl

[ReaderMETA]: Currently this covers only the RPM package. Please
submit info about other available packages if you use them.

=head2 A Word on mod_perl RPM Packages

The virtues of RPM packages are a subject of much debate among
mod_perl users. While RPMs do take the pain away from package
installation and maintenance for most applications, the nuances of
mod_perl make RPMs somewhat less than ideal for those just getting
started. The following help and advice is for those new to mod_perl,
Apache, Linux, and RPMs. If you know what you are doing, this is
probably old hat -- contributing your past experiences is, as always,
welcomed by the community.

=head2 Getting Started

If you are new to mod_perl and are using this Guide and the Eagle
Book to help you on your way, it is probably better to grab the
latest Apache and mod_perl sources and compile them yourself. Not
only will you find that this is less daunting than you suspect, but
it will probably save you a few headaches down the line, for several
reasons.
First, given the pace at which the open source community produces
software, RPMs, especially those found on distribution CDs, are often
several versions out of date. The most recent version will not only
be more stable, but will likely incorporate some new functionality
that you will eventually want to play with.

It is also unlikely that the file system layout of an RPM package
will match what you see in the Eagle Book and this Guide. If you are
new to mod_perl, Apache, or both, you will probably want to get
familiar with the file system layout used by the examples given here
before trying something non-standard.

Finally, the RPMs found on a typical distribution's CDs use mod_perl
built with Apache's Dynamic Shared Object (C<DSO>) support. While
mod_perl can be successfully used as a DSO module, it adds a layer of
complexity that you may want to live without for now.

All that being said, should you still feel that rolling your own
mod_perl enabled Apache server is not for you, here are a few helpful
hints...

=head2 Compiling RPM source files

It is possible to compile the source files provided by RPM packages,
but if you are using RPMs to ease mod_perl installation, that is not
the way to do it. Both Apache and mod_perl RPMs are designed to be
install-and-go. If you really want to compile mod_perl to your own
specific needs, your best bet is to get the most recent sources from
CPAN.

=head2 Mix and Match RPM and source

It is probably not the best idea to use a self-compiled Apache with a
mod_perl RPM (or vice versa). Sticking with one format or the other
at first will result in fewer headaches and more hair.

=head2 Installing a single apache+mod_perl RPM

If you use an Apache+mod_perl RPM, chances are C<rpm -i> or C<glint>
(a GUI for RPM) will have you up and running immediately, no
compilation necessary. If you encounter problems, try downloading
from another mirror site or searching http://rpmfind.net/ for a
different package -- there are plenty out there to choose from.

David Harris has started an effort to build better RPM/SRPM mod_perl
packages. You will find the link to David's site at
http://perl.apache.org/distributions.html.

Features of this RPM:

=over

=item *

Installs mod_perl as an "add in" to the RedHat Apache package, but
does not install mod_perl as a DSO.

=item *

Includes the four header files required for building C<libapreq>
(C<Apache::Request>).

=item *

Distributes plain text forms of the pod documentation files that come
with mod_perl.

=item *

Checks the module magic number on the existing Apache package to see
if things are compatible.

=back

Notes on this unconventional RPM packaging of mod_perl by David
Harris E<lt>dharris (at) drh.netE<gt> on Oct 13, 1999:

This package will install the mod_perl library files on your machine,
along with the following two Apache files:

  /usr/lib/apache/mod_include_modperl.so
  /usr/sbin/httpd_modperl

This package does not install a complete Apache subtree built with
mod_perl, but rather just the two above files that are different for
mod_perl. This approach conceptually treats mod_perl as a kind of
"add on" that we would like to add to the regular Apache tree.
However, we are prevented from distributing mod_perl as an actual
DSO, because this is not recommended by the mod_perl developers and
various features must be turned off. So, instead, we distribute an
httpd binary with mod_perl statically linked (httpd_modperl) and the
special modified mod_include.so required for this binary
(mod_include_modperl.so).
You can use the exact same configuration files and other DSO modules;
you just "enable" the mod_perl "add on" by following the directions
below.

To enable mod_perl, do the following:

=over

=item 1

Configure I</etc/rc.d/init.d/httpd> to run C<httpd_modperl> instead
of C<httpd>, by changing the C<daemon> command line.

=item 2

Replace C<mod_include.so> with C<mod_include_modperl.so> in the
module loading section of I</etc/httpd/conf/httpd.conf>.

=item 3

Uncomment the C<AddModule mod_perl.c> line in
I</etc/httpd/conf/httpd.conf>.

=back

Or run the following command:

  /usr/sbin/modperl-enable on

and to disable mod_perl:

  /usr/sbin/modperl-enable off

=head2 Compiling libapreq (Apache::Request) with the RH 6.0 mod_perl RPM

Libapreq provides the L<Apache::Request|download/Apache__Request>
module. Despite many reports of libapreq not working properly with
various RPM packages, it is possible to integrate libapreq with
mod_perl RPMs. It just requires a few additional steps.

=over

=item 1

Make certain you have the C<apache-devel-x.x.x-x.i386.rpm> package
installed. Also, download the latest version of libapreq from CPAN.

=item 2

Install the source RPM for your mod_perl RPM and then do a build prep
(with C<rpm -bp>), which unpacks the sources. From there, copy the
four header files (I<mod_perl.h>, I<mod_perl_version.h>,
I<mod_perl_xs.h>, and I<perl_PL.h>) to C</usr/include/apache>:

=over

=item * 2.1

Get the SRPM from
C<somemirror.../redhat-x.x/SRPMS/mod_perl-x.xx-x.src.rpm>.

=item * 2.2

Install the SRPM. This creates files in C</usr/src/redhat/SPECS> and
C</usr/src/redhat/SOURCES>. Run:

  % rpm -ih mod_perl-x.xx-x.src.rpm

=item * 2.3

Do a C<"prep"> build of the package, which just unpacks the sources
and applies any patches:

  % rpm -bp /usr/src/redhat/SPECS/mod_perl.spec
  Executing: %prep
  + umask 022
  + cd /usr/src/redhat/BUILD
  + cd /usr/src/redhat/BUILD
  + rm -rf mod_perl-1.19
  + /bin/gzip -dc /usr/src/redhat/SOURCES/mod_perl-1.19.tar.gz
  + tar -xf -
  + STATUS=0
  + [ 0 -ne 0 ]
  + cd mod_perl-1.19
  ++ /usr/bin/id -u
  + [ 0 = 0 ]
  + /bin/chown -Rf root .
  ++ /usr/bin/id -u
  + [ 0 = 0 ]
  + /bin/chgrp -Rf root .
  + /bin/chmod -Rf a+rX,g-w,o-w .
  + echo Patch #0:
  Patch #0:
  + patch -p1 -b --suffix .rh -s
  + exit 0

NOTE: Steps 2.1 through 2.3 are just a fancy un-packing of the source
tree that builds the RPM into C</usr/src/redhat/BUILD/mod_perl-x.xx>.
You could unpack the C<mod_perl-x.xx.tar.gz> file somewhere and then
do the following steps on that source tree. The method shown above is
more "pure", because you're grabbing the header files from the same
tree that built the RPM. But this does not really matter, because
RedHat is not patching those files, so it might be easier to just
grab the mod_perl source and unpack it to get these files. Less fuss
and mess.

=item * 2.4

Look at the files you will copy (this is not really a step, but it's
useful to show):

  % find /usr/src/redhat/BUILD/mod_perl-1.19 -name '*.h'
  /usr/src/redhat/BUILD/mod_perl-1.19/src/modules/perl/mod_perl.h
  /usr/src/redhat/BUILD/mod_perl-1.19/src/modules/perl/mod_perl_xs.h
  /usr/src/redhat/BUILD/mod_perl-1.19/src/modules/perl/mod_perl_version.h
  /usr/src/redhat/BUILD/mod_perl-1.19/src/modules/perl/perl_PL.h

=item * 2.5

Copy the files into C</usr/include/apache>:

  % find /usr/src/redhat/BUILD/mod_perl-1.19 -name '*.h' \
    -exec cp {} /usr/include/apache \;

NOTE: You should not have to do:

  % mkdir /usr/include/apache

because that directory should be created by apache-devel.
=back

=item 3

Apply this patch to libapreq:
http://www.davideous.com/modperlrpm/distrib/libapreq-0.31_include.patch

=item 4

Follow the libapreq directions as usual:

  % perl Makefile.PL
  % make && make test && make install

=back

=head2 Installing separate Apache and mod_perl RPMs

If you are trying to install separate Apache and mod_perl RPMs, like
those provided by the RedHat distributions, you may be in for a bit
of a surprise. Installing the Apache RPM will go just fine, and
http://localhost will bring up some type of web page for you.
However, after installation of the mod_perl RPM, the L<How can I tell
whether mod_perl is running|install/How_can_I_tell_whether_mod_perl_>
test will show that Apache is not mod_perl enabled. This is because
mod_perl needs to be added as a separate module using Apache's
Dynamic Shared Objects.

To use mod_perl as a DSO, make the following modifications to your
Apache configuration files:

  httpd.conf
  ----------
  LoadModule perl_module modules/libperl.so
  AddModule mod_perl.c

  PerlModule Apache::Registry
  Alias /perl/ /home/httpd/perl/
  <Location /perl>
    SetHandler perl-script
    PerlHandler Apache::Registry
    PerlSendHeader On
    Options +ExecCGI
  </Location>

After a complete shutdown and startup of the server, mod_perl should
be up and running.

=head2 Testing the mod_perl API

Some people have reported that even when the server responds
positively to the L<How can I tell whether mod_perl is
running|install/How_can_I_tell_whether_mod_perl_> tests, the mod_perl
API will not function properly. You may want to run the following
script to verify the availability of the mod_perl API:

  use strict;
  # under Apache::Registry the request object is
  # passed to the script as its first argument
  my $r = shift;
  $r->send_http_header('text/html');
  $r->print("It worked!!!\n");

=head1 Installation Without Superuser Privileges

As you have already learned, a mod_perl enabled Apache consists of
two main components: Perl modules and Apache itself. Let's tackle the
tasks one at a time.

I'll show a complete installation example using I<stas> as a
username, assuming that I</home/stas> is the home directory of that
user.

=head2 Installing Perl Modules into a Directory of Choice

Since without superuser permissions you aren't allowed to install
modules into system directories like I</usr/lib/perl5>, you need to
find out how to install the modules under your home directory. It's
easy.

First you have to decide where to install the modules. The simplest
approach is to simulate the portion of the I</> file system relevant
to Perl under your home directory. Actually we need only two
directories:

  /home/stas/bin
  /home/stas/lib

We don't have to create them, since that will be done automatically
when the first module is installed. 99% of the files will go into the
I<lib> directory. Occasionally, when some module distribution comes
with Perl scripts, these will go into the I<bin> directory; it will
be created if it doesn't exist.

Let's install the I<CGI.pm> package, which includes a few other
C<CGI::*> modules. As usual, download the package from the CPAN
repository, unpack it and C<chdir> to the newly-created directory.

Now do a standard C<perl Makefile.PL> to prepare a I<Makefile>, but
this time tell C<MakeMaker> to use your Perl installation directories
instead of the defaults:

  % perl Makefile.PL PREFIX=/home/stas

C<PREFIX=/home/stas> is the only part of the installation process
which is different from usual.
Note that if you don't like how C<MakeMaker> chooses the rest of the
directories, or if you are using an older version of it which
requires an explicit declaration of all the target directories, you
should do this:

  % perl Makefile.PL PREFIX=/home/stas \
    INSTALLPRIVLIB=/home/stas/lib/perl5 \
    INSTALLSCRIPT=/home/stas/bin \
    INSTALLSITELIB=/home/stas/lib/perl5/site_perl \
    INSTALLBIN=/home/stas/bin \
    INSTALLMAN1DIR=/home/stas/lib/perl5/man \
    INSTALLMAN3DIR=/home/stas/lib/perl5/man3

The rest is as usual:

  % make
  % make test
  % make install

C<make install> installs all the files in the private repository.
Note that all the missing directories are created automatically, so
there is no need to create them in the first place. Here (slightly
edited) is what it does:

  Installing /home/stas/lib/perl5/CGI/Cookie.pm
  Installing /home/stas/lib/perl5/CGI.pm
  Installing /home/stas/lib/perl5/man3/CGI.3
  Installing /home/stas/lib/perl5/man3/CGI::Cookie.3
  Writing /home/stas/lib/perl5/auto/CGI/.packlist
  Appending installation info to /home/stas/lib/perl5/perllocal.pod

If you have to use the explicit target parameters instead of a single
C<PREFIX> parameter, you will find it useful to create a file called,
for example, I<~/.perl_dirs> (where I<~> is C</home/stas> in our
example), containing:

  PREFIX=/home/stas \
  INSTALLPRIVLIB=/home/stas/lib/perl5 \
  INSTALLSCRIPT=/home/stas/bin \
  INSTALLSITELIB=/home/stas/lib/perl5/site_perl \
  INSTALLBIN=/home/stas/bin \
  INSTALLMAN1DIR=/home/stas/lib/perl5/man \
  INSTALLMAN3DIR=/home/stas/lib/perl5/man3

From now on, any time you want to install perl modules locally you
simply execute:

  % perl Makefile.PL `cat ~/.perl_dirs`
  % make
  % make test
  % make install

Using this method you can easily maintain several Perl module
repositories. For example, you could have one for production Perl and
another for development:

  % perl Makefile.PL `cat ~/.perl_dirs.production`

or

  % perl Makefile.PL `cat ~/.perl_dirs.develop`

=head2 Making Your Scripts Find the Locally Installed Modules

Perl modules are generally placed in four main directories. To find
these directories, execute:

  % perl -V

The output contains important information about your Perl
installation. At the end you will see:

  Characteristics of this binary (from libperl):
  Built under linux
  Compiled at Apr  6 1999 23:34:07
  @INC:
    /usr/lib/perl5/5.00503/i386-linux
    /usr/lib/perl5/5.00503
    /usr/lib/perl5/site_perl/5.005/i386-linux
    /usr/lib/perl5/site_perl/5.005
    .

This shows us the contents of the Perl special variable C<@INC>,
which is used by Perl to look for its modules. It is equivalent to
the C<PATH> environment variable, used in Unix shells to find
executable programs.

Notice that Perl looks for modules in the I<.> directory too, which
stands for the current directory. It's the last entry in the above
output.

Of course this example is from version I<5.00503> of Perl installed
on my x86 architecture PC running Linux. That's why you see
I<i386-linux> and I<5.00503>. If your system runs a different version
of Perl, operating system, processor or chipset architecture, then
some of the directories will have different names.

I also have a perl-5.6.0 installed under C</usr/local/lib/>, so when
I do:

  % /usr/local/bin/perl5.6.0 -V

I see:

  @INC:
    /usr/local/lib/perl5/5.6.0/i586-linux
    /usr/local/lib/perl5/5.6.0
    /usr/local/lib/site_perl/5.6.0/i586-linux
    /usr/local/lib/site_perl

Note that it's still I<Linux>, but the newer Perl version uses the
version of my Pentium processor (thus the I<i586> and not I<i386>).
This makes use of compiler optimizations for Pentium processors when
the binary Perl extensions are created.

All the platform specific files, such as compiled C files glued to
Perl with C<XS> or C<SWIG>, are supposed to go into the
C<i386-linux>-like directories.

B<Important:> As we have installed the Perl modules into non-standard
directories, we have to let Perl know where to look for the four
directories. There are two ways to accomplish this. You can either
set the C<PERL5LIB> environment variable, or you can modify the
C<@INC> variable in your scripts.

Assuming that we use perl-5.00503, in our example the directories
are:

  /home/stas/lib/perl5/5.00503/i386-linux
  /home/stas/lib/perl5/5.00503
  /home/stas/lib/perl5/site_perl/5.005/i386-linux
  /home/stas/lib/perl5/site_perl/5.005

As mentioned before, you find the exact directories by executing
C<perl -V> and replacing the global Perl installation's base
directory with your home directory.

Modifying C<@INC> is quite easy. The best approach is to use the
C<lib> module (pragma), by adding the following snippet at the top of
any of your scripts that require the locally installed modules:

  use lib qw(/home/stas/lib/perl5/5.00503/
             /home/stas/lib/perl5/site_perl/5.005);

Another way is to write code to modify C<@INC> explicitly:

  BEGIN {
    unshift @INC,
      qw(/home/stas/lib/perl5/5.00503
         /home/stas/lib/perl5/5.00503/i386-linux
         /home/stas/lib/perl5/site_perl/5.005
         /home/stas/lib/perl5/site_perl/5.005/i386-linux);
  }

Note that with the C<lib> module we don't have to list the
corresponding architecture specific directories, since it adds them
automatically if they exist (to be exact, when I<$dir/$archname/auto>
exists). Also, notice that both approaches I<prepend> the directories
to be searched to C<@INC>. This allows you to install a more recent
module into your local repository, and Perl will use it instead of
the older one installed in the main system repository.

Both approaches modify the value of C<@INC> at compilation time. The
C<lib> module uses the I<BEGIN> block as well, but internally.

Now, let's assume the following scenario. I have installed the C<LWP>
package in my local repository. Now I want to install another module
(e.g. mod_perl), and it has C<LWP> listed in its prerequisites list.
I know that I have C<LWP> installed, but when I run C<perl
Makefile.PL> for the module I'm about to install, I'm told that I
don't have C<LWP> installed.

There is no way for Perl to know that we have some locally installed
modules. All it does is search the directories listed in C<@INC>, and
since the latter contains only the default four directories (plus the
I<.> directory), it cannot find the locally installed C<LWP> package.
We cannot solve this problem by adding code to modify C<@INC>, but
changing the C<PERL5LIB> environment variable will do the trick. If
you are using C<t?csh> for interactive work, do this:

  setenv PERL5LIB /home/stas/lib/perl5/5.00503:
  /home/stas/lib/perl5/site_perl/5.005

It should be a single line with directories separated by colons
(C<:>) and no spaces. If you are a C<(ba)?sh> user, do this:

  export PERL5LIB=/home/stas/lib/perl5/5.00503:
  /home/stas/lib/perl5/site_perl/5.005

Again, make it a single line. If you use bash you can use multi-line
commands by terminating split lines with a backslash (C<\>), like
this:

  export PERL5LIB=/home/stas/lib/perl5/5.00503:\
  /home/stas/lib/perl5/site_perl/5.005

As with C<use lib>, Perl automatically prepends the architecture
specific directories to C<@INC> if those exist.
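By the way, a quick way to inspect the resulting search path without
wading through the whole C<perl -V> report is a one-liner (just a
convenience, not required):

  % perl -le 'print for @INC'

With C<PERL5LIB> set as above, your local directories should appear
at the top of the list.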
When you have done this, verify the value of the newly configured C<@INC> by executing C<perl -V> as before. You should see the modified value of C<@INC>:

  % perl -V

  Characteristics of this binary (from libperl):
  Built under linux
  Compiled at Apr  6 1999 23:34:07
  %ENV:
    PERL5LIB="/home/stas/lib/perl5/5.00503:
    /home/stas/lib/perl5/site_perl/5.005"
  @INC:
    /home/stas/lib/perl5/5.00503/i386-linux
    /home/stas/lib/perl5/5.00503
    /home/stas/lib/perl5/site_perl/5.005/i386-linux
    /home/stas/lib/perl5/site_perl/5.005
    /usr/lib/perl5/5.00503/i386-linux
    /usr/lib/perl5/5.00503
    /usr/lib/perl5/site_perl/5.005/i386-linux
    /usr/lib/perl5/site_perl/5.005
    .

When everything works as you want it to, add these commands to your I<.tcshrc> or I<.bashrc> file. The next time you start a shell, the environment will be ready for you to work with the new Perl.

Note that if you have a C<PERL5LIB> setting, you don't need to alter the C<@INC> value in your scripts. But if for example someone else (who doesn't have this setting in the shell) tries to execute your scripts, Perl will fail to find your locally installed modules. The best example is a crontab script, which might use a different C<SHELL> environment, in which case the C<PERL5LIB> setting won't be available to it.

So the best approach is to have both the C<PERL5LIB> environment variable and the explicit C<@INC> extension code at the beginning of the scripts, as described above.

=head2 The CPAN.pm Shell and Locally Installed Modules

As we saw in the section describing the usage of the C<CPAN.pm> shell to install mod_perl, it saves a great deal of time. It does the job for us, even detecting the missing modules listed in prerequisites, fetching and installing them. So you might wonder whether you can use C<CPAN.pm> to maintain your local repository as well.

When you start the C<CPAN> interactive shell, it searches first for the user's private configuration file and then for the system wide one. When I'm logged in as user C<stas> the two files on my setup are:

    /home/stas/.cpan/CPAN/MyConfig.pm
    /usr/lib/perl5/5.00503/CPAN/Config.pm

If there is no C<CPAN> shell configured on your system, when you start the shell for the first time it will ask you a dozen configuration questions and then create the I<Config.pm> file for you.

If the shell is already configured system-wide, you should have a C</usr/lib/perl5/5.00503/CPAN/Config.pm>. If you have a different Perl version, alter the path to use your Perl's version number when looking up the file. Create the directory (C<mkdir -p> creates the whole path at once) where the local configuration file will go:

  % mkdir -p /home/stas/.cpan/CPAN

Now copy the system wide configuration file to your local one.

  % cp /usr/lib/perl5/5.00503/CPAN/Config.pm \
    /home/stas/.cpan/CPAN/MyConfig.pm

The only thing left is to change the base directory of I<.cpan> in your local file to the one under your home directory. On my machine I replace C</usr/src/.cpan> (that's where my system's C<.cpan> directory resides) with C</home/stas>. I use Perl of course!

  % perl -pi -e 's|/usr/src|/home/stas|' \
    /home/stas/.cpan/CPAN/MyConfig.pm

Now that you have the local configuration file ready, you have to tell it what special parameters you need to pass when executing the C<perl Makefile.PL> stage. Open the file in your favorite editor and replace the line:

    'makepl_arg' => q[],

with:

    'makepl_arg' => q[PREFIX=/home/stas],

Now you've finished the configuration.
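Before installing anything, it's worth a quick sanity check that the shell really picks up your private configuration file. Start the shell and print the value back with the C<o conf> command (the same command we will use again below):

  % perl -MCPAN -e shell
  cpan> o conf makepl_arg

  PREFIX=/home/stas

If the printed value isn't what you just wrote into I<MyConfig.pm>, re-check the file's path and syntax.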
Assuming that you are logged in as the same user you have prepared the local installation for (I<stas> in our example), start it like this:

  % perl -MCPAN -e shell

From now on any module you try to install will be installed locally. If you need to install some system modules, just become the superuser and install them in the same way. When you are logged in as the superuser, the system-wide configuration file will be used instead of your local one.

If you have used more variables than just C<PREFIX>, modify I<MyConfig.pm> to use them. For example if you have used these variables:

  perl Makefile.PL PREFIX=/home/stas \
    INSTALLPRIVLIB=/home/stas/lib/perl5 \
    INSTALLSCRIPT=/home/stas/bin \
    INSTALLSITELIB=/home/stas/lib/perl5/site_perl \
    INSTALLBIN=/home/stas/bin \
    INSTALLMAN1DIR=/home/stas/lib/perl5/man \
    INSTALLMAN3DIR=/home/stas/lib/perl5/man3

then replace C<PREFIX=/home/stas> in the line:

    'makepl_arg' => q[PREFIX=/home/stas],

with all the variables from above, so that the line becomes:

    'makepl_arg' => q[PREFIX=/home/stas \
    INSTALLPRIVLIB=/home/stas/lib/perl5 \
    INSTALLSCRIPT=/home/stas/bin \
    INSTALLSITELIB=/home/stas/lib/perl5/site_perl \
    INSTALLBIN=/home/stas/bin \
    INSTALLMAN1DIR=/home/stas/lib/perl5/man \
    INSTALLMAN3DIR=/home/stas/lib/perl5/man3],

If you arrange all the above parameters in one line, you can remove the backslashes (C<\>).

=head2 Making a Local Apache Installation

Just like with Perl modules, if you don't have permissions to install files into the system area you have to install them locally under your home directory. It's almost the same as a plain installation, but you have to run the server listening on a port number greater than 1024, since only root processes can listen on lower numbered ports.

Another important issue you have to resolve is how to add startup and shutdown scripts to the directories used by the rest of the system services. You will have to ask your system administrator to assist you with this issue.

To install Apache locally, all you have to do is to tell C<./configure> in the Apache source directory what target directories to use. If you are following the convention that I use, which makes your home directory look like the C</> (base) directory, the invocation parameters would be:

  ./configure --prefix=/home/stas

Apache will use the prefix for the rest of its target directories instead of the default C</usr/local/apache>. If you want to see what they are, before you proceed add the I<--show-layout> option:

  ./configure --prefix=/home/stas --show-layout

You might want to put all the Apache files under C</home/stas/apache> following Apache's convention:

  ./configure --prefix=/home/stas/apache

If you want to modify some or all of the names of the automatically created directories:

  ./configure --prefix=/home/stas/apache \
    --sbindir=/home/stas/apache/sbin \
    --sysconfdir=/home/stas/apache/etc \
    --localstatedir=/home/stas/apache/var \
    --runtimedir=/home/stas/apache/var/run \
    --logfiledir=/home/stas/apache/var/logs \
    --proxycachedir=/home/stas/apache/var/proxy

That's all!

Also remember that you can start the server only under a user and group you belong to. You must set the C<User> and C<Group> directives in I<httpd.conf> to appropriate values.

=head2 Manual Local mod_perl Enabled Apache Installation

Now that we have learned how to install a local Apache and Perl modules separately, let's see how to install a mod_perl enabled Apache in our home directory.
It's almost as simple as doing each one separately, but there is one wrinkle you need to know about, which I'll mention at the end of this section.

Let's say you have unpacked the Apache and mod_perl sources under I</home/stas/src> and they look like this:

  % ls /home/stas/src
  /home/stas/src/apache_x.x.x
  /home/stas/src/mod_perl-x.xx

where I<x.xx> are the version numbers as usual. You want the Perl modules from the mod_perl package to be installed under I</home/stas/lib/perl5> and the Apache files to go under I</home/stas/apache>. The following commands will do that for you:

  % perl Makefile.PL \
    PREFIX=/home/stas \
    APACHE_PREFIX=/home/stas/apache \
    APACHE_SRC=../apache_x.x.x/src \
    DO_HTTPD=1 \
    USE_APACI=1 \
    EVERYTHING=1
  % make && make test && make install
  % cd ../apache_x.x.x
  % make install

If you need some parameters to be passed to the C<./configure> script, as we saw in the previous section, use C<APACI_ARGS>. For example:

  APACI_ARGS='--sbindir=/home/stas/apache/sbin, \
    --sysconfdir=/home/stas/apache/etc, \
    --localstatedir=/home/stas/apache/var, \
    --runtimedir=/home/stas/apache/var/run, \
    --logfiledir=/home/stas/apache/var/logs, \
    --proxycachedir=/home/stas/apache/var/proxy'

Note that the above multiline splitting will work only with C<(ba)?sh>; C<t?csh> users will have to list all the parameters on a single line.

Basically the installation is complete. The only remaining problem is the C<@INC> variable. This won't be correctly set if you rely on the C<PERL5LIB> environment variable unless you set it explicitly in a startup file which is C<require>'d before loading any other module that resides in your local repository. A much nicer approach is to use the C<lib> pragma as we saw before, but in a slightly different way--we use it in the startup file and it affects all the code that will be executed under mod_perl handlers. For example:

  PerlRequire /home/stas/apache/perl/startup.pl

where C<startup.pl> starts with:

  use lib qw(/home/stas/lib/perl5/5.00503/
             /home/stas/lib/perl5/site_perl/5.005);

Note that you can still use the hard-coded C<@INC> modifications in the scripts themselves, but be aware that scripts modify C<@INC> in C<BEGIN> blocks and mod_perl executes the C<BEGIN> blocks only when it performs script compilation. As a result, C<@INC> will be reset to its original value after the scripts are compiled and the hard-coded settings will be forgotten. See the section 'L<@INC and mod_perl|porting/_INC_and_mod_perl>' for more information.

The only place you can alter the "original" value is during the server configuration stage, either in the startup file or by putting

  PerlSetEnv PERL5LIB \
  /home/stas/lib/perl5/5.00503/:/home/stas/lib/perl5/site_perl/5.005

in I<httpd.conf>, but the latter setting will be ignored if you use the C<PerlTaintCheck> setting, and I hope you do use it.

The rest of the mod_perl configuration and use is just the same as if you were installing mod_perl as superuser.

=head3 Resource Usage

Another important thing to keep in mind is the consumption of system resources. mod_perl is memory hungry. If you run a lot of mod_perl processes on a public, multiuser machine, most likely the system administrator of this machine will ask you to use fewer resources and may even shut down your mod_perl server and ask you to find another home for it. You have a few options:

=over 4

=item * Reduce resource usage (see L<Limiting the size of the processes|performance/Limiting_the_Size_of_the_Process>).
=item * Ask your ISP's system administrator whether they can set up a dedicated machine for you, so that you will be able to install as much memory as you need. If you get a dedicated machine, the chances are that you will want to have root access, so you may be able to manage the administration yourself. You should also consider leaving the following items on the system administrator's list of responsibilities: a reliable electricity supply and network link; making sure that the important security patches get applied and the machine is configured to be secure; and finally, having the machine physically protected, so no one will turn off the power or break it.

=item * Look for another ISP with lots of resources or one that supports mod_perl. You can find a list of these ISPs at http://perl.apache.org .

=back

=head2 Local mod_perl Enabled Apache Installation with CPAN.pm

Again, CPAN makes installation and upgrades simpler. You have seen how to install a mod_perl enabled server using C<CPAN.pm>'s interactive shell. You have seen how to install Perl modules and Apache locally. Now all you have to do is to merge these techniques into a single "local mod_perl enabled Apache installation with CPAN.pm" technique.

Assuming that you have configured C<CPAN.pm> to install Perl modules locally, the installation is very simple. Start the C<CPAN.pm> shell, set the arguments to be passed to C<perl Makefile.PL> (modify the example setting to suit your needs), and tell C<CPAN.pm> to do the rest for you:

  % perl -MCPAN -eshell
  cpan> o conf makepl_arg 'DO_HTTPD=1 USE_APACI=1 EVERYTHING=1 \
        PREFIX=/home/stas APACHE_PREFIX=/home/stas/apache'
  cpan> install mod_perl

When you use C<CPAN.pm> for local installations, after the mod_perl installation is complete you must make sure that the value of C<makepl_arg> is restored to its original value.

The simplest way to do this is to quit the interactive shell by typing I<quit> and reenter it. But if you insist, here is how to make it work without quitting the shell. You really want to skip this :)

If you want to continue working with C<CPAN> *without* quitting the shell, you must:

=over 4

=item 1 remember the value of C<makepl_arg>

=item 2 change it to suit your new installation

=item 3 build and install mod_perl

=item 4 restore it after completing the mod_perl installation

=back

This is quite a cumbersome task as of this writing, but I believe that C<CPAN.pm> will eventually be improved to handle this more easily.

So if you are still with me, start the shell as usual:

  % perl -MCPAN -eshell

First, read the value of C<makepl_arg>:

  cpan> o conf makepl_arg

  PREFIX=/home/stas

It will be something like C<PREFIX=/home/stas> if you configured C<CPAN.pm> to install modules locally. Save this value:

  cpan> o conf makepl_arg.save PREFIX=/home/stas

Second, set a new value, to be used by the mod_perl installation process. (You can add parameters to this line, or remove them, according to your needs.)

  cpan> o conf makepl_arg 'DO_HTTPD=1 USE_APACI=1 EVERYTHING=1 \
        PREFIX=/home/stas APACHE_PREFIX=/home/stas/apache'

Third, let C<CPAN.pm> build and install mod_perl for you:

  cpan> install mod_perl

Fourth, restore the original value of C<makepl_arg>. We do this by printing the value of the saved variable and assigning it back to C<makepl_arg>:

  cpan> o conf makepl_arg.save

  PREFIX=/home/stas

  cpan> o conf makepl_arg PREFIX=/home/stas

Not so neat, but a working solution.
You could have written the value on a piece of paper instead of saving it to C<makepl_arg.save>, but you are more likely to make a mistake that way.

=head1 Automating installation

=over

=item * Apache Builder

James G Smith wrote an Apache Builder that can install a combination of Apache, mod_perl, and mod_ssl -- it also has limited support for including mod_php in the mix.

  http://hex.tamu.edu/build-apache (the actual Perl script)
  http://hex.tamu.edu/generic.conf (a sample configuration file)

=item * Aphid Apache Installer

Aphid provides a facility for bootstrapping SSL-enabled Apache web servers (mod_ssl) with an embedded Perl interpreter (mod_perl). Source is downloaded from the Internet, compiled, and the resulting system is installed in the directory you specify.

  http://sourceforge.net/projects/aphid/

=back

=head1 How can I tell whether mod_perl is running?

There are a few ways. In older versions of Apache (before 1.3.6 or so) you could check by running C<httpd -v>, but that no longer works. Now you should use C<httpd -l>. Please note that it is not enough to have mod_perl installed; you have to configure the server for mod_perl and restart it too.

=head2 Checking the error_log

When starting the server, just check the C<error_log> file for the following message:

  [Thu Dec  3 17:27:52 1998] [notice] Apache/1.3.1 (Unix) mod_perl/1.15
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  configured -- resuming normal operations

=head2 Testing by viewing /perl-status

Assuming that you have configured the S<E<lt>Location /perl-statusE<gt>> section in the server configuration file, fetch:

  http://www.example.com/perl-status

using your favorite Mozilla browser :-)

You should see something like this:

  Embedded Perl version 5.00503 for Apache/1.3.9 (Unix) mod_perl/1.21
  process 50880, running since Mon Dec 6 14:31:45 1999

=head2 Testing via telnet

Knowing the port you have configured Apache to listen on, you can use C<telnet> to talk directly to it. Assuming that your mod_perl enabled server listens on port 8080, telnet to your server at port 8080, type C<HEAD / HTTP/1.0> and then press the C<ENTER> key TWICE:

  % telnet localhost 8080<ENTER>
  HEAD / HTTP/1.0<ENTER><ENTER>

You should see a response like this:

  HTTP/1.1 200 OK
  Date: Mon, 06 Dec 1999 12:27:52 GMT
  Server: Apache/1.3.9 (Unix) mod_perl/1.21
  Connection: close
  Content-Type: text/html

  Connection closed.

The line

  Server: Apache/1.3.9 (Unix) mod_perl/1.21

confirms that you have mod_perl installed and that its version is C<1.21>.

However, just because you have got mod_perl linked in there, that does not mean that you have configured your server to handle Perl scripts with mod_perl. You will find configuration assistance at L<ModPerlConfiguration|config/>

=head2 Testing via a CGI script

Another method is to invoke a CGI script which dumps the server's environment.

I assume that you have configured the server so that scripts running under the location I</perl/> are handled by the C<Apache::Registry> handler and that you have the C<PerlSendHeader> directive set to C<On>.

Copy and paste the script below (no need for a shebang line!). Let's say you name it I<test.pl> and save it at the root of the CGI scripts directory, with the CGI root mapped directly to the I</perl> location of your server.
print "Content-type: text/plain\r\n\r\n"; print "Server's environment\n"; foreach ( keys %ENV ) { print "$_\t$ENV{$_}\n"; } Make it readable and executable by server (you may need to tune these permissions on a public host): % chmod a+rx test.pl Now fetch the URL C<http://www.example.com:8080/perl/test.pl> (replace 8080 with the port your mod_perl enabled server is listening to). You should see something like this (the output has been edited): SERVER_SOFTWARE Apache/1.3.10-dev (Unix) mod_perl/1.21_01-dev GATEWAY_INTERFACE CGI-Perl/1.1 DOCUMENT_ROOT /home/httpd/docs REMOTE_ADDR 127.0.0.1 [more environment variables snipped] MOD_PERL mod_perl/1.21_01-dev [more environment variables snipped] If you see the that the value of C<GATEWAY_INTERFACE> is C<CGI-Perl/1.1> everything is OK. If there is an error you might have to add a shebang line C<#!/usr/bin/perl> as a first line of the CGI script and then try it again. If you see: GATEWAY_INTERFACE CGI/1.1 it means that you have configured this location to run under mod_cgi and not mod_perl. Also note that there is a C<MOD_PERL> environment variable if you run under a mod_perl handler, it's set to the I<mod_perl/x.xx> string, where I<x.xx> is the version number of mod_perl. Based on this difference you can write code like this: BEGIN { # perl5.004 or better is a must under mod_perl require 5.004 if $ENV{MOD_PERL}; } You might wonder why in the world you would need to know what handler you are running under. Well, for example you will want to use C<Apache::exit()> and not C<CORE::exit()> in your modules, but if you think that your script might be used in both environments (mod_cgi and mod_perl) you will have to override the C<exit()> subroutine and to make decision what method to use at the runtime. Note that if you run scripts under the C<Apache::Registry> handler, it takes care of overriding the C<exit()> call for you, so it's not an issue. For reasons and implementations see: L<Terminating requests and processes, exit() function|porting/Terminating_requests_and_process> and also L<Writing Mod Perl scripts and Porting plain CGIs to it|porting/>. =head2 Testing via lwp-request Yet another one. Why do I show all these approaches? While here they serve a very simple purpose, they can be helpful in other situations. Assuming you have the C<libwww-perl> (C<LWP>) package installed (you will need it installed in order to pass mod_perl's C<make test> anyway): % lwp-request -e -d http://www.example.com Will show you all the headers. The C<-d> option disables printing the response content. % lwp-request -e -d http://www.example.com | egrep '^Server:' To see the server version only. Use C<http://www.example.com:port_number> if your server is listening to a port other than port 80. =head1 General Notes =head2 Is it possible to run mod_perl enabled Apache as suExec? The answer is B<No>. The reason is that you can't I<"suid"> a part of a process. mod_perl lives inside the Apache process, so its UID and GID are the same as the Apache process. You have to use mod_cgi if you need this functionality. Another solution is to use a crontab to call some script that will check whether there is something to do and will execute it. The mod_perl script will be able to create and update this todo list. =head2 Should I Rebuild mod_perl if I have Upgraded Perl? Yes, you should. You have to rebuild the mod_perl enabled server since it has a hard-coded C<@INC> variable. This points to the old Perl and it is probably linked to an old C<libperl> library. 
If for some reason you need to keep the old Perl version around you can modify C<@INC> in the startup script, but it is better to build afresh to save you getting into a mess.

=head2 Perl installation requirements

Make sure you have Perl installed! The latest stable version if possible. Minimum perl 5.004! If you don't have it, install it. Follow the instructions in the distribution's C<INSTALL> file.

During the configuration stage (while running C<./Configure>), to be able to dynamically load Perl module extensions, make sure you answer C<YES> to the question:

  Do you wish to use dynamic loading? [y]

=head2 mod_auth_dbm nuances

If you are a C<mod_auth_dbm> or C<mod_auth_db> user you may need to edit Perl's C<Config> module. When Perl is configured it attempts to find libraries for ndbm, gdbm, db, etc., for the DB*_File modules. By default, these libraries are linked with Perl and remembered by the C<Config> module. When mod_perl is configured with Apache, the C<ExtUtils::Embed> module requires these libraries to be linked with httpd so Perl extensions will work under mod_perl.

However, the order in which these libraries are stored in B<Config.pm> may confuse C<mod_auth_db*>. If C<mod_auth_db*> does not work with mod_perl, take a look at the order with the following command:

  % perl -V:libs

Here's an example:

  libs='-lnet -lnsl_s -lgdbm -lndbm -ldb -ldld -lm -lc -lndir -lcrypt';

If C<-lgdbm> or C<-ldb> is before C<-lndbm> (as it is in the example), edit I<Config.pm> and move C<-lgdbm> and C<-ldb> to the end of the list. Here's how to find I<Config.pm>:

  % perl -MConfig -e 'print "$Config{archlibexp}/Config.pm\n"'

Under Solaris, another solution for building Apache/mod_perl+mod_auth_dbm is to remove the DBM and NDBM "emulation" from I<libgdbm.a>. It seems that Solaris already provides its own DBM and NDBM, and in our installation we found there's no reason to build GDBM with them.

In our Makefile for GDBM, we changed

  OBJS = $(DBM_OF) $(NDBM_OF) $(GDBM_OF)

to

  OBJS = $(GDBM_OF)

Rebuild libgdbm before Apache/mod_perl.

=head2 Stripping Apache to make it almost a Perl-server

Since most of the functionality that the various Apache mod_* modules provide is implemented in the C<Apache::{*}> Perl modules, it was reported that one can build an Apache server with mod_perl only. If you can reduce the requirements to whatever mod_perl can handle, you can eliminate almost every other module. Then basically you will have a Perl-server, with C code to handle the tricky HTTP bits. The only module you will need to leave in is C<mod_actions>.

=head2 Saving the config.status Files with mod_perl, php, ssl and Other Components

Typically, when building the bloated Apache that sits behind Squid or whatever, you need mod_perl, php, mod_ssl and the rest. As you install each one, it typically overwrites the C<config.status> file left by the previous one. Save a copy after each step, so you will be able to reuse the files later.

=head2 What Compiler Should Be Used to Build mod_perl?

All Perl modules that use C extensions must be compiled using the same compiler that your copy of Perl was built with.

When you run C<perl Makefile.PL>, a I<Makefile> is created. This I<Makefile> includes the same compilation options that were used to build Perl itself. They are stored in the I<Config.pm> module and can be displayed with the C<perl -V> command. All these options are re-applied when compiling Perl modules.
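For example, to print just the compiler that Perl was built with, you can ask for that specific C<Config> entry by name (the same works for any other configuration variable, such as C<ccflags>; C<gcc> here is only an example value):

  % perl -V:cc
  cc='gcc';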
If you use a different compiler to build Perl extensions, chances are that the options that a different compiler uses won't be the same, or they might be interpreted in a completely different way. So the code either won't compile, or it will dump core when run, or it will behave in most unexpected ways.

Since mod_perl uses Perl, Apache and third party modules, and they all work together, it's essential to use the same compiler while building each of the components.

You shouldn't worry about this when compiling Perl modules since Perl will choose what's right automatically. Unless you override things. If you do that, you are on your own...

If you compile a non-Perl component separately, you should make sure to use both the same compiler and the same options used to build Perl. Hint: Take a look at the I<Config.pm> module or the output of C<perl -V>.

=head1 OS Related Notes

=over 3

=item *

Gary Shea E<lt>shea (at) xmission.comE<gt> discovered a nasty BSDI bug (seen in versions 2.1 and 3.0) related to dynamic loading and found two workarounds:

It turns out that they use C<argv[0]> to determine where to find the link tables at run-time, so if a program either changes C<argv[0]>, or does a chdir() (like Apache!) it can easily confuse the dynamic loader. The short-term solutions to the problem are simple. Either of the following will work:

1) Call httpd with a full path, e.g. I</opt/www/bin/httpd>

2) Put the httpd you wish to run in a directory in your PATH I<before> any other directory containing a version of httpd, then call it as 'httpd'. Don't use a relative path!

=back

=head1 Pros and Cons of Building mod_perl as DSO

On modern Unix derivatives there is a nifty mechanism usually called dynamic linking/loading of Dynamic Shared Objects (DSO), which provides a way to build a piece of program code in a special format for loading at run-time into the address space of an executable program.

As of Apache 1.3, the configuration system supports two optional features for taking advantage of the modular DSO approach: compilation of the Apache core program into a DSO library for shared usage and compilation of the Apache modules into DSO files for explicit loading at run-time.

Should you use this method? Read the pros and cons and decide for yourself.

Pros:

=over

=item *

The server package is more flexible at run-time because the actual server process can be assembled at run-time via C<LoadModule> I<httpd.conf> configuration commands instead of I<Configuration> C<AddModule> commands at build-time. For instance, this way one is able to run different server instances (standard & SSL version, with and without mod_perl) with only one Apache installation.

=item *

The server package can be easily extended with third-party modules even after installation. This is at least a great benefit for vendor package maintainers who can create an Apache core package and additional packages containing extensions like PHP4, mod_perl, mod_fastcgi, etc.

=item *

Easier Apache module prototyping, because with the DSO/apxs pair you can both work outside the Apache source tree and need only an S<apxs -i> command followed by an S<apachectl restart> to bring a new version of your currently developed module into the running Apache server.

=back

Cons:

=over

=item *

The DSO mechanism cannot be used on every platform, because not all operating systems support dynamic loading of code into the address space of a program.
=item *

The server starts up approximately 20% slower because of the symbol resolving overhead the Unix loader now has to do.

=item *

The server runs approximately 5% slower on some platforms, because position independent code (PIC) sometimes needs complicated assembler tricks for relative addressing, which are not necessarily as fast as absolute addressing.

=item *

Because DSO modules cannot be linked against other DSO-based libraries (ld -lfoo) on all platforms (for instance a.out-based platforms usually don't provide this functionality while ELF-based platforms do), you cannot use the DSO mechanism for all types of modules. In other words, modules compiled as DSO files are restricted to use only symbols from the Apache core, from the C library (libc) and all other dynamic or static libraries used by the Apache core, or from static library archives (libfoo.a) containing position independent code. The only ways you can use other code are either to make sure the Apache core itself already contains a reference to it, to load the code yourself via dlopen(), or to enable the SHARED_CHAIN rule while building Apache, when your platform supports linking DSO files against DSO libraries.

=item *

On some platforms (many SVR4 systems) there is no way to force the linker to export all global symbols for use in DSOs when linking the Apache httpd executable program. But without the visibility of the Apache core symbols no standard Apache module could be used as a DSO. The only workaround here is to use the SHARED_CORE feature, because this way the global symbols are forced to be exported. As a consequence the Apache src/Configure script automatically enforces SHARED_CORE on these platforms when DSO features are used in the Configuration file or on the configure command line.

=back

=cut

1.1 modperl-docs/src/docs/1.0/guide/intro.pod

Index: intro.pod
===================================================================

=head1 NAME

Introduction. Incentives. Credits.

=head1 What is mod_perl

The Apache/Perl integration project brings together the full power of the Perl programming language and the Apache HTTP server. With mod_perl it is possible to write Apache modules entirely in Perl, letting you easily do things that are more difficult or impossible in regular CGI programs, such as running sub requests. In addition, the persistent Perl interpreter embedded in the server saves the overhead of starting an external interpreter, i.e. the penalty of Perl start-up time. And not the least important feature is code caching, where modules and scripts are loaded and compiled only once, and for the rest of the server's life they are served from the cache. Thus the server spends its time only running already loaded and compiled code, which is very fast.

The primary advantages of mod_perl are power and speed. You have full access to the inner workings of the web server and can intervene at any stage of request-processing. This allows for customized processing of (to name just a few of the phases) URI-E<gt>filename translation, authentication, response generation, and logging. There is very little run-time overhead. In particular, it is not necessary to start a separate process, as is often done with web-server extensions. The most wide-spread such extension, the Common Gateway Interface (CGI), can be replaced entirely with Perl code that handles the response generation phase of request processing.
mod_perl includes two general purpose modules for this purpose: C<Apache::Registry>, which can transparently run existing perl CGI scripts, and C<Apache::PerlRun>, which does a similar job but allows you to run "dirtier" (to some extent) scripts.

You can configure your httpd server and handlers in Perl (using C<PerlSetVar> and E<lt>PerlE<gt> sections). You can even define your own configuration directives.

Many people ask "How much of a performance improvement does mod_perl give?" Well, it all depends on what you are doing with mod_perl and possibly who you ask. Developers report speed boosts from 200% to 2000%. The best way to measure is to try it and see for yourself! (See http://perl.apache.org/tidbits.html and http://perl.apache.org/stories/ for the facts.)

=head2 mod_cgi

When you run your CGI scripts by using a configuration like this:

  ScriptAlias /cgi-bin/ /home/httpd/cgi-bin/

you run them under the mod_cgi handler, though you never define it explicitly. Apache does all the configuration work behind the scenes when you use a ScriptAlias.

By the way, don't confuse C<ScriptAlias> with the C<ExecCGI> configuration option, which we enable so that the script will be executed rather than returned as a plain text file. For example for mod_perl and C<Apache::Registry> you would use a configuration like:

  <Location /perl>
    SetHandler perl-script
    PerlHandler Apache::Registry
    Options ExecCGI
    PerlSendHeader On
  </Location>

=head2 C API

META: complete

=head2 Perl API

META: complete

=head2 Apache::Registry

From the viewpoint of the Perl API, C<Apache::Registry> is simply another handler that's not conceptually different from any other handler. C<Apache::Registry> reads in the script file, compiles and executes it, and stores it in the cache. Since the perl interpreter keeps running from the child process's creation until its death, any code compiled by the interpreter is kept in memory until the child dies.

To prevent script name collisions, C<Apache::Registry> creates a unique key for each cached script by prepending C<Apache::ROOT::> to the mangled path of the script's URI. This key is actually the package name that the script resides in. So if you have requested a script C</perl/project/test.pl>, the script would be wrapped in code which starts with a package declaration of:

  package Apache::ROOT::perl::project::test_e2pl;

C<Apache::Registry> also stores the script's last modification time. Every time the script changes, the cached code is discarded and recompiled using the modified source. However, it doesn't check the modification times of any of the perl libraries the script might use.

C<Apache::Registry> overrides C<CORE::exit()> with C<Apache::exit()>, so CGI scripts that use C<exit()> will run correctly. We will talk about all these details in depth later.

The last thing C<Apache::Registry> does is emulate mod_cgi's environment variables, like C<$ENV{SERVER_NAME}>, C<$ENV{REMOTE_USER}> and so on. B<PerlSetupEnv Off> disables this feature, which saves some memory and CPU cycles.

From the viewpoint of the programmer, there is almost no difference between running a script as a plain CGI script under mod_cgi and running it under mod_perl. There is however a great speed improvement, but at the expense of much heavier memory usage (there is no free lunch :).

When they run under mod_cgi, your CGI scripts are loaded each time they are called and then they exit. Under mod_perl they are loaded once and cached. This gives a big performance boost.
But because the code is cached and doesn't exit, it won't clean up memory as it would under mod_cgi. This can have unexpected effects.

Your scripts will be recompiled and reloaded by mod_perl when it detects that you have changed them, but remember that any libraries that your scripts might require() or use() will not be recompiled when they are changed. You will have to take action yourself to ensure that they are recompiled.

Of course the guide will address all these issues in depth.

Let's see what happens to your script when it's being executed under C<Apache::Registry>. If we take the simplest possible code (URI C</perl/project/test.pl>)

  print "Content-type: text/html\n\n";
  print "It works\n";

C<Apache::Registry> will convert it into the following:

  package Apache::ROOT::perl::project::test_e2pl;
  use Apache qw(exit);
  sub handler {
    print "Content-type: text/html\n\n";
    print "It works\n";
  }

The first line provides a unique namespace for the code to use, and a unique key by which the code can be referenced from the cache.

The second line imports C<Apache::exit>, which overrides perl's built-in C<exit>.

The C<sub handler> subroutine is wrapped around your code. By default (i.e. if you do not specify an alternative), when you use mod_perl and your code's URI is called, mod_perl will seek to execute the URI's associated C<handler> subroutine.

META: Complete

=head2 Apache::PerlRun

META: Complete

=head1 What will you learn

This document was written in an effort to help you start using Apache's mod_perl extension as quickly and easily as possible. It includes information about the installation and configuration of both Perl and the Apache web server and delves deeply into the issues of writing and porting existing Perl scripts to run under mod_perl. Note that it does not attempt to enter the big world of using the Perl API or C API. You will find pointers to coverage of these topics in the L<Getting Help and Further Learning|help/> section of this document. This guide tries to cover most of the C<Apache::Registry> and C<Apache::PerlRun> modules. Along with mod_perl related topics, many more issues are covered, related to administering Apache servers, debugging scripts, using databases, mod_perl related Perl, code snippets and more. The L<Guide's Overview|start/> will help you to find your way through the guide.

It is assumed that you know at least the basics of building and installing Perl and Apache. (If you do not, just read the INSTALL documents which are part of the distribution of each package.) However, in this guide you will find specific Perl and Apache installation and configuration notes, which will help you successfully complete the mod_perl installation and get the server running in a short time.

If after reading this guide and the other documents listed in L<Getting Help and Further Learning|help/> you feel that your questions remain unanswered, you could try asking the apache/mod_perl mailing list to help you. But first try to browse the mailing list archive (located at http://mathforum.org/epigone/modperl ). Often you will find the answer to your question by searching the mailing list archive, since most questions have been asked and answered already! If you ignore this advice, do not be surprised if your question goes unanswered - it bores people when they're asked to answer the same question repeatedly - especially if the answer can be found in the archive or in the documentation.
This does not mean that you should avoid asking questions, just do not abuse the available help and B<RTFM> before you call for B<HELP>. (You have certainly heard the infamous fable of the shepherd boy and the wolves...) And if you do ask questions on the mailing list I<please> make your subject line descriptive of the problem, not just "Help" - you're far more likely to get replies if people can see the issue you are talking about straight away. If you find incorrect details or mistakes in my grammar, or you want to contribute to this document please feel free to send me an email at [EMAIL PROTECTED] . =head1 High-Profile Sites Running mod_perl A report prepared by Rex Staples at Thu, 14 Oct 1999: =over =item * Macromedia 4,273,000 unique visitors/month Aug-1999 http://www.macromedia.com Apache/1.3.4 (Unix) mod_perl/1.18 on Solaris =item * ValueClick: Results-based advertising network 80 million page views/day May 2000 using about 45 machines http://valueclick.com Apache/1.3.9-dev (Unix) mod_perl/1.21_01 on FreeBSD =item * Deja.com 130 million pageviews/month Oct-1999 http://www.deja.com Apache/1.3b5 mod_perl/1.08 on Linux =item * MP3.com, Inc. 77 million page views/month Aug-1999 408,000 unique visitors/day Aug-1999 http://www.mp3.com http://www.mp3.com/pr/990914-keymetrics.html Apache/1.3.4-9 (Unix) mod_perl/1.18-21 on Linux/FreeBSD =item * IMDB: Internet Movie Database 1.25 million page views/day Mar-1998 http://www.imdb.com * They are now an Amazon.com company Apache/1.3.7-dev (Unix) mod_perl/1.19_01-dev =item * Flash.net: Internet Service Provider 1,603,000 unique visitors/month Aug-1999 http://www.flash.net http://www.mediametrix.com/TopRankings/TopRankings.html Apache/1.2.4 mod_perl/1.00 on Solaris =item * At Hand Network Yellow Pages 917,000 unique visitors/month Aug-1999 http://www.athand.com http://www.mediametrix.com/TopRankings/TopRankings.html Stronghold/2.3 Apache/1.2.6 (Unix) mod_perl/1.15 on Solaris =item * Commissioner.com: Subscription Fantasy Football 12 million page views/day Oct-1999 http://www.commissioner.com Apache/1.35b mod_perl/1.10 on Linux =item * Slashdot: News For Nerds 400,000 page views/day Oct-1999 http://www.slashdot.org Apache/1.3.6 (Unix) mod_perl/1.21 on Linux =item * Hot Bot mail and member web pages: http://members.hotbot.com Also widely used on HotWired, WiredNews, Webmonkey, and Suck.com Apache/1.3.4 (Unix) mod_perl/1.21 on Solaris =item * Art Today: subscription clip-art service 250k hits/day http://www.arttoday.com Oracle 7 + 1 Sun Ultra w/150GB storage Apache/1.3.4 (Unix) mod_perl/1.17 on Solaris =item * CMPnet: a technology information network 500k hits/day http://www.cmpnet.com Apache/1.3.9 (Unix) mod_perl/1.16 =back =head1 References and Acknowledgments I have used the following references while writing this guide: =over 4 =item * B<mod_perl FAQ> by Frank Cringle at http://perl.apache.org/faq/ . =item * B<mod_perl performance tuning guide> by Vivek Khera at http://perl.apache.org/tuning/ . =item * B<mod_perl plugin reference guide> by Doug MacEachern at http://perl.apache.org/src/mod_perl.html . =item * B<Quick guide for moving from CGI to mod_perl> at http://perl.apache.org/dist/cgi_to_mod_perl.html . =item * B<mod_perl_traps, common traps and solutions for mod_perl users> at http://perl.apache.org/dist/mod_perl_traps.html . =item * B<mod_perl mailing list emails>. Answers to some of the questions posted to Apache/Perl mailing list. (To send email to [EMAIL PROTECTED]) =item * B<My personal experience with mod_perl>. 
=back

I have quoted many snippets of information from FAQs and emails, but I have not credited each quote in the guide individually. I did not mean to take the credit for myself; it's just that I tried to keep track of names, and became lost, so instead of scattering credits throughout the Guide I have gathered them all together here. If you want your name to show up under your original quote, please tell me and I'll add it for you.

Major contributors:

=over 4

=item * B<Doug MacEachern>. A large part of this guide is built upon his email replies to users' questions.

=item * B<Frank Cringle>. Parts of his mod_perl FAQ have been used in this guide.

=item * B<Vivek Khera>. For his mod_perl performance tuning guide. And lots of useful comments on the list that made it into the guide.

=item * B<Steve Reppucci>, who did a thorough review of the stuff I wrote. He fixed lots of spelling and grammar errors, and made the guide readable to English speakers :)

=item * B<Eric Cholet>, who wrote complete sections for the guide, and pointed out technical errors in it.

=item * B<Ken Williams>, who reviewed a lot of stuff in the guide. Many snippets from his emails are included in the guide.

=item * B<Matt Sergeant>, who contributed the section "Exception Handling for mod_perl" for the perl reference chapter and made many other contributions.

=item * B<Wesley Darlington> for contributing a big section for the scenario chapter.

=item * B<Geoffrey S Young> and B<David Harris> for contributing big sections about mod_perl and RPM packages, and providing helpful comments and corrections.

=item * B<Andreas J. Koenig> for contributing his "Correct HTTP headers" document.

=item * B<Ged W. Haywood> for reviewing and fixing the whole guide, providing lots of constructive criticism and helping to reorganize the guide to make it more user friendly.

=item * B<Mark Summerfield> for reviewing and fixing all the guide's chapters, improving the guide's readability and suggesting useful extensions.

=item * B<Jeffrey W. Baker> for his "guide to mod_perl database performance" and many useful comments on the list that have been reused in the guide's material.

=item * B<Richard A. Wells> for reviewing and correcting a large part of the guide.

=item * B<Randy Harmon> for rewriting the mod_perl advocacy chapter.

=item * B<Dean Fitz> for reviewing the "Operating System and Hardware Demands" chapter.

=back

Credits of course go to (alphabetically sorted):

=over

=item * Aaron Johnson =item * Ajay Shah =item * Alexander Farber =item * Andreas J. Koenig =item * Andreas Piesk =item * Andrei A. Voropaev =item * Andrew Ford =item * Anthony D. Ettinger =item * Artur Zambrzycki =item * Ask Bjoern Hansen =item * Barrie Slaymaker =item * Bill Moseley =item * Boris Zentner =item * Brian Moseley =item * Carl Hansen =item * Chad K. Lewis =item * Chris Nokleberg =item * Chris Winters =item * Christof Damian =item * Christophe Dupre =item * Cliff Rayman =item * Craig =item * Daniel Bohling =item * Daniel Koch =item * Daniel W. Burke =item * Darren Chamberlain =item * Dave Hodgkinson =item * Dave Rolsky =item * David Harris =item * David Huggins-Daines =item * David Landgren =item * David Mitchell =item * DeWitt Clinton =item * Dean Fitz =item * Doug Bagley =item * Doug Kyle =item * Drew Taylor =item * Ed Park =item * Ed Phillips =item * Edmund Mergl =item * Edwin Pratomo =item * Eric Cholet =item * Eric Strovink =item * Evan A.
Zacks =item * Ewan Edwards =item * Frank Schoeters =item * Garr Updegraff =item * Ged Haywood =item * Geoff Crawshaw =item * Geoffrey Young =item * Gerald Richter =item * Gerd Knops =item * Glenn =item * Greg Cope =item * Greg Stark =item * Gunther Birznieks =item * Hailei Dai =item * Henrique Pantarotto =item * Honza Pazdziora =item * Howard Jones =item * Hunter Monroe =item * Ilya Obshadko =item * Ime Smits =item * Issac Goldstand =item * James Furness =item * James G Smith =item * James W Walden =item * Jan Peter Hecking =item * Jason Bodnar =item * Jason Rhinelander =item * Jauder Ho =item * Jay J =item * Jean-Louis Guenego =item * Jeff Chan =item * Jeff Rowe =item * Jeffrey W. Baker =item * Jie Gao =item * Joao Fonseca =item * Joe Schaefer =item * Joe Slag =item * John Armstrong =item * John Deighan =item * John Hyland =item * John Milton =item * John Walker =item * Jon Orwant =item * Jonathan Peterson =item * Joshua Chamas =item * Kavitha =item * Kees Vonk =item * Ken Williams =item * Kenny Gatdula =item * Kevin Murphy =item * Kevin Swope =item * Lance Cleveland =item * Larry Leszczynski =item * Leslie Mikesell =item * Lincoln Stein =item * Lupe Christoph =item * Mads Toftum =item * Marc Lehmann =item * Marcel Grunauer =item * Mark Mills =item * Mark Summerfield =item * Marko van der Puil =item * Marshall Dudley =item * Matt Sergeant =item * Matthew Darwin =item * Michael Blakeley =item * Michael Finke =item * Michael Hall =item * Michael Rendell =item * Michael Schout =item * Michele Beltrame =item * Mike Depot =item * Mike Fletcher =item * Mike MacKenzie =item * Mike Miller =item * Nancy Lin =item * Nathan Torkington =item * Nathan Vonnahme =item * Neil Conway =item * Nick Tonkin =item * Oleg Bartunov =item * Owen Williams =item * Pascal Eeftinck =item * Patrick =item * Paul Cotter =item * Pavel Shmidt =item * Perrin Harkins =item * Peter Galbavy =item * Peter Haworth =item * Peter J. Schoenster =item * Peter Skov =item * Philip Jacob =item * Philip Newton =item * Radu Greab =item * Ralf Engelschall =item * Randal L. Schwartz =item * Randy Harmon =item * Randy Kobes =item * Rauznitz Balazs =item * Rex Staples =item * Richard A. Wells =item * Richard Chen =item * Richard Dice =item * Richard More =item * Rick Myers =item * Robert Mathews =item * Robin Berjon =item * Rodger Donaldson =item * Ron Pero =item * Roy Nasser =item * Salve J Nilsen =item * Scott Fagg =item * Scott Holdren =item * Sean Dague =item * Shane Nay =item * Stephane Benoit =item * Stephen Judd =item * Steve Fink =item * Steve Reppucci =item * Steve Willer =item * Surat Singh Bhati =item * Terry West =item * Tim Bunce =item * Todd Finney =item * Tom Brown =item * Tom Christiansen =item * Tom Hughes =item * Tom Mornini =item * Tuomas Salo =item * Tzvetan Stoyanov =item * Ulrich Neumerkel =item * Ulrich Pfeifer =item * Vivek Khera =item * Ward Vandewege =item * Wesley Darlington =item * Will Trillich =item * Yann Kerhervé =item * Yann Ramin =item * =item * Did I miss your name? Tell me! =back I want to thank all the people who donated their time and efforts to make this amazing idea of mod_perl a reality. This includes Doug MacEachern, the author of mod_perl, and all the developers who contributed bug patches, modules and help. And of course the numerous unseen users around the world who help to promote mod_perl and to make it a better tool. 
=cut

1.1 modperl-docs/src/docs/1.0/guide/modules.pod

Index: modules.pod
===================================================================

=head1 NAME

Apache::* modules

=head1 Apache::Session - Maintain session state across HTTP requests

This module provides the Apache/mod_perl user with a mechanism for storing persistent user data in a global hash, which is independent of the underlying storage mechanism. Currently you can choose from these storage mechanisms: C<Apache::Session::DBI>, C<Apache::Session::Win32>, C<Apache::Session::File>, C<Apache::Session::IPC>. Read the man page of the mechanism you want to use for a complete reference.

C<Apache::Session> provides persistence to a data structure. The data structure has an ID number, and you can retrieve it by using the ID number. In the case of Apache, you would store the ID number in a cookie or the URL to associate it with one browser, but the method of dealing with the ID is completely up to you. The flow of things is generally:

  Tie a session to Apache::Session.
  Get the ID number.
  Store the ID number in a cookie.
  End of Request 1.

  (time passes)

  Get the cookie.
  Restore your hash using the ID number in the cookie.
  Use whatever data you put in the hash.
  End of Request 2.

Using C<Apache::Session> is easy: simply tie a hash to the session object, stick any data structure into the hash, and the data you put in automatically persists until the next invocation. Here is an example which uses cookies to track the user's session.

  # pull in the required packages
  use Apache::Session::DBI;
  use Apache;

  use strict;

  # database credentials (placeholders -- use your own)
  my ($db_user, $db_pass) = ('user', 'secret');

  # read in the cookie if this is an old session
  my $r = Apache->request;
  my $cookie = $r->header_in('Cookie') || '';
  $cookie =~ s/SESSION_ID=(\w*)/$1/;

  # create a session object based on the cookie we got from the
  # browser, or a new session if we got no cookie
  my %session;
  tie %session, 'Apache::Session::DBI', $cookie,
      {DataSource => 'dbi:mysql:sessions',
       UserName   => $db_user,
       Password   => $db_pass
      };

  # might be a new session, so let's give them their cookie back
  my $session_cookie = "SESSION_ID=$session{_session_id};";
  $r->header_out("Set-Cookie" => $session_cookie);

After setting this up, you can stick anything you want into C<%session> (except file handles and code references; also, the I<_session_id> key is reserved), and it will still be there when the user invokes the next page.

It is possible to write an Apache authentication handler using C<Apache::Session>. You can put your authentication token into the session. When a user invokes a page, you open their session, check to see if they have a valid token, and authenticate or forbid based on that. (See the sketch at the end of this section.)

By way of comparison, note that IIS's sessions are only valid on the same web server as the one that issued the session. C<Apache::Session>'s session objects can be shared amongst a farm of many machines running different operating systems, including even Win32. IIS stores session information in RAM. C<Apache::Session> stores sessions in databases, file systems, or RAM. IIS's sessions are only good for storing scalars or arrays. C<Apache::Session>'s sessions allow you to store arbitrarily complex objects. IIS sets up the session and automatically tracks it for you. With C<Apache::Session>, you set up and track the session yourself. IIS is proprietary. C<Apache::Session> is open-source. C<Apache::Session::DBI> can issue 400+ session requests per second on a modest Celeron 300A running Linux. IIS?

An alternative to C<Apache::Session> is C<Apache::ASP>, which has session tracking abilities.
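To illustrate the authentication idea mentioned above, here is a minimal, hypothetical sketch of an access control handler built on C<Apache::Session> (the C<My::SessionAuth> package name, the I<auth_token> session key, and the DSN and credentials are invented for this example; treat it as a starting point, not production code):

  package My::SessionAuth;
  use strict;
  use Apache::Constants qw(OK FORBIDDEN);
  use Apache::Session::DBI ();

  sub handler {
      my $r = shift;

      # extract the session id from the browser's cookie, if any
      my ($id) = (($r->header_in('Cookie') || '') =~ /SESSION_ID=(\w+)/);

      # restore the session; a bogus id makes the tie die, so trap it
      my %session;
      eval {
          tie %session, 'Apache::Session::DBI', $id,
              { DataSource => 'dbi:mysql:sessions',
                UserName   => 'user',       # placeholders
                Password   => 'secret' };
      };

      # a missing or freshly created session carries no token -- refuse
      return FORBIDDEN if $@ or not $session{auth_token};

      return OK;
  }
  1;

You would wire it in with something like C<PerlAccessHandler My::SessionAuth> inside the C<E<lt>LocationE<gt>> you want to protect. Whether you return FORBIDDEN, redirect to a login page, or validate the token against a database is entirely up to your application.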
C<HTML::Embperl> hooks into C<Apache::Session> for you.

=head1 Apache::DBI - Initiate a persistent database connection

See L<mod_perl and relational Databases|databases/Apache_DBI_Initiate_a_persist>

=head1 Apache::Watchdog::RunAway - Hanging Processes Monitor and Terminator

This module monitors hanging Apache/mod_perl processes. You define the time in seconds after which the process is to be counted as I<hanging> or I<run away>.

When the process is considered to be I<hanging> it will be killed and the event logged in a log file.

Generally you should use the C<amprapmon> program that is bundled with this module's distribution package, but you can write your own code using the module as well. See the I<amprapmon> manpage for more information about it.

Note that it requires the C<Apache::Scoreboard> module to work.

Refer to the C<Apache::Watchdog::RunAway> manpage for the configuration details.

=head1 Apache::VMonitor -- Visual System and Apache Server Monitor

C<Apache::VMonitor> is the next generation of L<mod_status|debug/mod_status>. It provides all the information mod_status provides and much more.

This module emulates the reporting functions of the top(), mount(), df() and ifconfig() utilities. There is a special mode for mod_perl processes. It has visual alert capabilities and a configurable I<automatic refresh> mode. It provides a Web interface, which can be used to show or hide all the sections dynamically.

There are two main modes:

=over

=item *

Multi processes mode -- all system processes and information are shown.

=item *

Single process mode -- in-depth information about a single process is shown.

=back

The main advantage of this module is that it reduces the need to telnet to the machine in order to monitor it. Indeed it provides information about mod_perl processes that cannot be acquired from telnetting to the machine.

=head3 Configuration

  # Configuration in httpd.conf
  <Location /sys-monitor>
    SetHandler perl-script
    PerlHandler Apache::VMonitor
  </Location>

  # startup file or <Perl> section:
  use Apache::VMonitor();
  $Apache::VMonitor::Config{BLINKING} = 0; # Blinking is evil
  $Apache::VMonitor::Config{REFRESH}  = 0;
  $Apache::VMonitor::Config{VERBOSE}  = 0;
  $Apache::VMonitor::Config{SYSTEM}   = 1;
  $Apache::VMonitor::Config{APACHE}   = 1;
  $Apache::VMonitor::Config{PROCS}    = 1;
  $Apache::VMonitor::Config{MOUNT}    = 1;
  $Apache::VMonitor::Config{FS_USAGE} = 1;
  $Apache::VMonitor::Config{NETLOAD}  = 1;
  @Apache::VMonitor::NETDEVS = qw(lo eth0);
  $Apache::VMonitor::PROC_REGEX = join "\|", qw(httpd mysql squid);

More information is available in the module's extensive manpage.

It requires C<Apache::Scoreboard> and C<GTop> to work. C<GTop> in turn requires the C<libgtop> library, which is not available for all platforms. Visit http://www.home-of-linux.org/gnome/libgtop/ to check whether your platform/flavor is supported.

=head1 Apache::GTopLimit - Limit Apache httpd processes

This module allows you to kill off Apache processes if they grow too large or if they share too little of their memory. You can choose to set up the process size limiter to check the process size on every request.

The module is thoroughly explained in the sections "L<Keeping the Shared Memory Limit|performance/Keeping_the_Shared_Memory_Limit>" and "L<Limiting the Size of the Processes|performance/Limiting_the_Size_of_the_Process>"

=head1 Apache::Request (libapreq) - Generic Apache Request Library

This package contains modules for manipulating client request data via the Apache API with Perl and C.
Functionality includes:

=over

=item * parsing of application/x-www-form-urlencoded data

=item * parsing of multipart/form-data

=item * parsing of HTTP Cookies

=back

The Perl modules are simply a thin XS layer on top of libapreq, making them a lighter and faster alternative to CGI.pm and CGI::Cookie. See the C<Apache::Request> and C<Apache::Cookie> documentation for more details and I<eg/perl/> for examples.

C<Apache::Request> and libapreq are tied tightly to the Apache API, to which there is no access in a process running under mod_cgi. (L<Apache::Request|download/Apache__Request>)

=head1 Apache::RequestNotes - Allow Easy, Consistent Access to Cookie and Form Data Across Each Request Phase

C<Apache::RequestNotes> provides a simple interface allowing all phases of the request cycle access to cookie or form input parameters in a consistent manner. Behind the scenes, it uses libapreq (L<Apache::Request|modules/Apache__Request__libapreq____Generic_Apache_Request_Library>) functions to parse request data and puts references to the data in C<pnotes()>.

Once the request is past the PerlInit phase, all other phases can have access to form input and cookie data without parsing it themselves. This relieves some strain, especially when the GET or POST data is required by numerous handlers along the way.

See the C<Apache::RequestNotes> manpage for more information.

=head1 Apache::PerlRun - Run unaltered CGI scripts under mod_perl

See L<Apache::PerlRun - a closer look|porting/Apache_PerlRun_a_closer_look>.

=head1 Apache::RegistryNG -- Apache::Registry New Generation

C<Apache::RegistryNG> is the same as C<Apache::Registry>, aside from using filenames instead of URIs for namespaces. It also uses an Object Oriented interface.

  PerlModule Apache::RegistryNG
  <Location /perl>
    SetHandler perl-script
    PerlHandler Apache::RegistryNG->handler
  </Location>

C<Apache::RegistryNG> inherits from C<Apache::PerlRun>, but the handler() is overridden. Aside from the handler(), the rest of C<Apache::PerlRun> contains all the functionality of C<Apache::Registry> broken down into several subclassable methods. These methods are used by C<Apache::RegistryNG> to implement the exact same functionality as C<Apache::Registry>, using the C<Apache::PerlRun> methods.

There is no compelling reason to use C<Apache::RegistryNG> over C<Apache::Registry>, unless you want to add to or change the functionality of the existing I<Registry.pm>. For example, C<Apache::RegistryBB> (Bare-Bones) is another subclass that skips the stat() call performed by C<Apache::Registry> on each request.

=head1 Apache::RegistryBB -- Apache::Registry Bare Bones

It works just like C<Apache::Registry>, but does not test the x bit (-x file test for executable mode), only compiles the file once (no stat() call is made per request), skips the C<OPT_EXECCGI> checks and does not C<chdir()> into the script parent directory. It uses the Object Oriented interface.

Configuration:

  PerlModule Apache::RegistryBB
  <Location /perl>
    SetHandler perl-script
    PerlHandler Apache::RegistryBB->handler
  </Location>

=head1 Apache::OutputChain -- Chain Stacked Perl Handlers

C<Apache::OutputChain> was written as a way of exploring the possibilities of stacked handlers in mod_perl. It ties STDOUT to an object which catches the output and makes it easy to build a chain of modules that work on the output data stream.
Examples of modules that are built on this idea are
C<Apache::SSIChain>, C<Apache::GzipChain> and C<Apache::EmbperlChain>
-- the first processes the SSIs in the stream, the second compresses
the output on the fly, the last adds Embperl processing.

The syntax goes like this:

  <Files *.html>
    SetHandler perl-script
    PerlHandler Apache::OutputChain Apache::SSIChain Apache::PassHtml
  </Files>

The modules are listed in the reverse order of their execution --
here the C<Apache::PassHtml> module simply picks a file's content and
sends it to STDOUT, then it's processed by C<Apache::SSIChain>, which
sends its output to STDOUT again. Then it's processed by
C<Apache::OutputChain>, which finally sends the result to the
browser.

An alternative to this approach is C<Apache::Filter>, which has a
more natural I<forward> configuration order and is easier to
interface with other modules. It works with C<Apache::Registry> as
well, for example:

  Alias /foo /home/httpd/perl/foo
  <Location /foo>
    SetHandler "perl-script"
    Options +ExecCGI
    PerlHandler Apache::OutputChain Apache::GzipChain Apache::Registry
  </Location>

It's really a regular C<Apache::Registry> setup, except for the added
modules in the PerlHandler line.
(L<Apache::GzipChain|modules/Apache__GzipChain___compress_HTML__or_anything__in_the_OutputChain>
allows you to compress the output on the fly.)

=head1 Apache::Filter - Alter the output of previous handlers

META: to be written (actually summarize the info from the
Apache::Filter manpage)

=head1 Apache::GzipChain - compress HTML (or anything) in the OutputChain

Have you ever served a huge HTML file (e.g. a file bloated with
JavaScript code) and wondered how you could send it compressed, thus
dramatically cutting down the download times? After all, Java applets
can be compressed into a jar and benefit from faster download times.
Why can't we do the same with plain ASCII (HTML, JS etc.)? ASCII text
can often be compressed by a factor of 10.

C<Apache::GzipChain> comes to help you with this task. If a client
(browser) understands C<gzip> encoding, this module compresses the
output and sends it downstream. The client decompresses the data upon
receipt and renders the HTML as if it were fetching plain HTML.

For example, to compress all HTML files on the fly, do this:

  <Files *.html>
    SetHandler perl-script
    PerlHandler Apache::OutputChain Apache::GzipChain Apache::PassFile
  </Files>

Remember that it will work only if the browser claims to accept
compressed input, by setting the C<Accept-Encoding> header.
C<Apache::GzipChain> keeps a list of user-agents, thus it also looks
at the C<User-Agent> header to check for browsers known to accept
compressed output.

For example, if you want to return compressed files which will in
addition pass through the Embperl module, you would write:

  <Location /test>
    SetHandler perl-script
    PerlHandler Apache::OutputChain Apache::GzipChain Apache::EmbperlChain Apache::PassFile
  </Location>

Hint: Watch the I<access_log> file to see how many bytes were
actually sent, and compare that with the bytes sent using a regular
configuration. (See also C<Apache::GzipChain>).

Notice that the rightmost PerlHandler must be a content producer.
Here we are using C<Apache::PassFile>, but you can use any module
which creates output.

=head1 Apache::Gzip - Auto-compress web files with Gzip

Similar to C<Apache::GzipChain>, but works with C<Apache::Filter>.
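To give a feel for the C<Apache::Filter> interface mentioned above,
here is a minimal sketch (my illustration, not C<Apache::Gzip>'s
actual source) of a filter handler that uppercases whatever the
previous handler produced:

  package My::UCFilter;   # hypothetical name
  use Apache::Constants qw(OK);

  sub handler {
      my $r = shift->filter_register;       # join the filter chain
      my ($fh, $status) = $r->filter_input; # output of previous handler
      return $status unless $status == OK;
      print uc while <$fh>;                 # transform and pass along
      return OK;
  }
  1;

C<Apache::Gzip> works on the same principle, compressing the stream
instead of uppercasing it.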
This configuration:

  PerlModule Apache::Filter
  <Files ~ "\.html$">
    SetHandler perl-script
    PerlSetVar Filter On
    PerlHandler Apache::Gzip
  </Files>

will send all the I<*.html> files compressed if the client accepts
the compressed input.

And this one:

  PerlModule Apache::Filter
  Alias /perl /home/http/perl
  <Location /perl>
    SetHandler perl-script
    PerlSetVar Filter On
    PerlHandler Apache::RegistryFilter Apache::Gzip
  </Location>

will compress the output of the C<Apache::Registry> scripts. Yes, you
should use C<Apache::RegistryFilter> instead of C<Apache::Registry>
for it to work.

You can use as many filters as you want:

  PerlModule Apache::Filter
  <Files ~ "\.blah$">
    SetHandler perl-script
    PerlSetVar Filter On
    PerlHandler Filter1 Filter2 Apache::Gzip
  </Files>

You can test that it works either by looking at the size of the
response in the I<access.log> or by telnet:

  panic% telnet localhost 8000
  Trying 127.0.0.1
  Connected to 127.0.0.1
  Escape character is '^]'.
  GET /perl/test.pl HTTP/1.1
  Accept-Encoding: gzip
  User-Agent: Mozilla

And you will get the data compressed if configured correctly.

=head1 Apache::PerlVINC - Allows Module Versioning in Location blocks and Virtual Hosts

With this module you can have different C<@INC> for different virtual
hosts, locations and equivalent configuration blocks.

Suppose two versions of C<Apache::Status> are being hacked on the
same server. In this configuration:

  PerlModule Apache::PerlVINC

  <Location /status-dev/perl>
    SetHandler       perl-script
    PerlHandler      Apache::Status

    PerlINC          /home/httpd/dev/lib
    PerlFixupHandler Apache::PerlVINC
    PerlVersion      Apache/Status.pm
  </Location>

  <Location /status/perl>
    SetHandler       perl-script
    PerlHandler      Apache::Status

    PerlINC          /home/httpd/prod/lib
    PerlFixupHandler Apache::PerlVINC
    PerlVersion      Apache/Status.pm
  </Location>

C<Apache::PerlVINC> is loaded, and then two different locations are
specified for the same handler C<Apache::Status>, whose development
version resides in I</home/httpd/dev/lib> and production version in
I</home/httpd/prod/lib>.

In case the I</status/perl> request is issued (the latter
configuration section), the fixup handler will internally do:

  delete $INC{'Apache/Status.pm'};
  unshift @INC, '/home/httpd/prod/lib';
  require "Apache/Status.pm";

which will load the production version of the module, and that
version will be used to process the request.

If, on the other hand, a request for the I</status-dev/perl> location
is issued, as configured in the former configuration section, a
similar thing happens, but a different path (I</home/httpd/dev/lib>)
is prepended to C<@INC>:

  delete $INC{'Apache/Status.pm'};
  unshift @INC, '/home/httpd/dev/lib';
  require "Apache/Status.pm";

It's important to be aware that the changed C<@INC> is effective only
inside the C<E<lt>LocationE<gt>> or similar configuration directive.
C<Apache::PerlVINC> subclasses the C<PerlRequire> directive, marking
the file to be reloaded by the fixup handler, using the value of
C<PerlINC> for C<@INC>. That's local to the fixup handler, so you
won't actually see C<@INC> changed in your script.

In addition, modules with different versions can be unloaded at the
end of the request, using the C<PerlCleanupHandler>:

  <Location /status/perl>
    SetHandler         perl-script
    PerlHandler        Apache::Status

    PerlINC            /home/httpd/prod/lib
    PerlFixupHandler   Apache::PerlVINC
    PerlCleanupHandler Apache::PerlVINC
    PerlVersion        Apache/Status.pm
  </Location>

Also notice that C<PerlVersion> affects things differently depending
on where it is placed.
If it is placed inside a C<E<lt>LocationE<gt>> or a similar block
section, the files will be reloaded only on requests to that
location. If it is placed in a server section, all requests to the
server or virtual hosts will have these files reloaded.

As you can guess, this module slows the response time down because it
reloads some modules on a per-request basis. Hence, this module
should only be used in a development environment, not a production
one.

=head1 Apache::LogSTDERR

When Apache's builtin syslog support is used, the stderr stream is
redirected to C</dev/null>. This means that Perl warnings, any
messages from C<die()>, C<croak()>, etc., will also end up in the
black hole. The I<HookStderr> directive will hook the stderr stream
to a file of your choice; the default is shown in this example:

  PerlModule Apache::LogSTDERR
  HookStderr logs/stderr_log

[META: see http://mathforum.org/epigone/modperl/vixquimwhen ]

=head1 Apache::RedirectLogFix

Because of the way mod_perl handles redirects, the status code is not
properly logged. The C<Apache::RedirectLogFix> module works around
that bug until mod_perl can deal with this. All you have to do is to
enable it in the I<httpd.conf> file.

  PerlLogHandler Apache::RedirectLogFix

For example, you will have to use it when doing:

  $r->status(304);

and some manual header sending, like this:

  $r->status(304);
  $r->send_http_header();

=head1 Apache::SubProcess

The output of C<system()>, C<exec()>, and C<open(PIPE,"|program")>
calls will not be sent to the browser unless your Perl was configured
with C<sfio>.

One workaround is to use backticks:

  print `command here`;

But a cleaner solution is provided by the C<Apache::SubProcess>
module. It overrides the exec() and system() calls with calls that
work correctly under mod_perl.

Let's see a few examples:

  use Apache::SubProcess qw(system);
  my $r = shift;
  $r->send_http_header('text/plain');
  system "/bin/echo hi there";

overrides the built-in system() function and sends the output to the
browser.

  use Apache::SubProcess qw(exec);
  my $r = shift;
  $r->send_http_header('text/plain');
  exec "/usr/bin/cal";
  print "NOT REACHED\n";

overrides the built-in exec() function and sends the output to the
browser. As you can see, the print statement after the exec() call
will never be executed.

  use Apache::SubProcess ();
  my $r = shift;
  $r->send_http_header('text/plain');
  my $efh = $r->spawn_child(\&env);
  $r->send_fd($efh);

  sub env {
      my $r = shift;
      $r->subprocess_env(HELLO => 'world');
      $r->filename("/bin/env");
      $r->call_exec;
  }

env() is a function that sets an environment variable that can be
seen by the main process and sub-processes, then executes the
I</bin/env> program via call_exec(). The main code spawns a process
and tells it to execute the env() function. This call returns an
output filehandle from the spawned child process. Finally, it takes
the output generated by the child process and sends it to the browser
via send_fd(), which expects the filehandle as an argument.

  use Apache::SubProcess ();
  my $r = shift;
  $r->send_http_header('text/plain');
  my $fh = $r->spawn_child(\&banner);
  $r->send_fd($fh);

  sub banner {
      my $r = shift;
      # /usr/games/banner on many Unices
      $r->filename("/usr/bin/banner");
      $r->args("-w40+Hello%20World");
      $r->call_exec;
  }

This example is very similar to the previous one, but shows how you
can pass arguments to the external process. It passes the string to
be printed as a banner via the subprocess.
  use Apache::SubProcess ();
  my $r = shift;
  $r->send_http_header('text/plain');

  use vars qw($String);
  $String = "hello world";
  my($out, $in, $err) = $r->spawn_child(\&echo);
  print $out $String;
  $r->send_fd($in);

  sub echo {
      my $r = shift;
      $r->subprocess_env(CONTENT_LENGTH => length $String);
      $r->filename("/tmp/pecho");
      $r->call_exec;
  }

The last example shows how you can have full access to the STDIN,
STDOUT and STDERR streams of the spawned sub-process, so you can pipe
data to a program and send its output to the browser. The echo()
function is similar to the earlier example's env() function. The
I</tmp/pecho> script is as follows:

  #!/usr/bin/perl
  read STDIN, $buf, $ENV{CONTENT_LENGTH};
  print "STDIN: `$buf' ($ENV{CONTENT_LENGTH})\n";

So in the last example a string is defined as a global variable, so
that its length can be calculated in the echo() function. The
subprocess reads from STDIN, to which the main process writes the
string (I<hello world>). It reads only the number of bytes specified
by C<CONTENT_LENGTH>, passed to the external program via the
environment. Finally the external program prints the data that it
read to STDOUT; the main program intercepts it and sends it to the
client's socket (the browser in most cases).

=head1 Module::Use - Log and Load used Perl modules

C<Module::Use> records the modules used over the course of the Perl
interpreter's lifetime. If the logging module is able, the old logs
are read and frequently used modules are automatically loaded.

For example if configured as:

  <Perl>
    use Module::Use (Counting, Logger => "Debug");
  </Perl>

  PerlChildExitHandler Module::Use

it will only record the used modules when the child exits, logging
everything (debug level).

=head1 Apache::ConfigFile - Parse an Apache style httpd.conf config file

This module parses I<httpd.conf>, or any compatible config file, and
provides methods for accessing the values from the parsed file.

See the module manpage for more information.

=cut

1.1                  modperl-docs/src/docs/1.0/guide/multiuser.pod

Index: multiuser.pod
===================================================================
=head1 NAME

mod_perl for ISPs. mod_perl and Virtual Hosts

=head1 ISPs providing mod_perl services - a fantasy or a reality

=over 4

=item *

You installed mod_perl on your box at home, and you fell in love with
it. So now you want to convert your CGI scripts (which currently are
running on your favorite ISP's machine) to run under mod_perl. Then
you discover that your ISP has never heard of mod_perl, or refuses to
install it for you.

=item *

You are an old sailor in the ISP business, you have seen it all, you
know how many ISPs are out there and you know that the sales margins
are too low to keep you happy. You are looking for some new service
almost no one else provides, to attract more clients to become your
users and hopefully to have a bigger slice of the action than your
competitors.

=back

If you are a user asking for a mod_perl service or an ISP considering
providing this service, this section should make things clear for
both of you.

An ISP has three choices:

=over 4

=item 1

ISPs probably cannot let users run scripts under mod_perl on the main
server. There are many reasons for this:

Scripts might leak memory, due to sloppy programming. There will not
be enough memory to run as many servers as required, and clients will
not be satisfied with the service because it will be slower.
The question of file permissions is a very important issue: any user
who is allowed to write and run a CGI script can at least read (if
not write) any other files that belong to the same user and/or group
the web server is running as. Note that L<it's impossible to run
C<suEXEC> and C<cgiwrap> extensions under
mod_perl|install/Is_it_possible_to_run_mod_perl_e>.

Another issue is the security of the database connections. If you use
C<Apache::DBI>, by hacking the C<Apache::DBI> code a user can pick a
connection from the pool of cached connections, even if it was opened
by someone else, as long as the scripts run on the same web server.

There are many more things to be aware of, so at this time you have
to say I<No>.

Of course as an ISP you can run mod_perl internally, without allowing
your users to map their scripts so that they will run under mod_perl.
If as a part of your service you provide scripts such as guest books,
counters etc., which are not available for user modification, you can
still have these scripts running very fast.

=item 2

But hey, why can't I let my users run their own servers, so I can
wash my hands of them and not have to worry about how dirty and
sloppy their code is (assuming that the users are running their
servers under their own usernames, to prevent them from stealing code
and data from each other)?

This option is fine as long as you are not concerned about your new
system's resource requirements. If you have even very limited
experience with mod_perl, you know that mod_perl-enabled Apache
servers, while freeing up your CPU and allowing you to run scripts
very much faster, have huge memory demands (5-20 times those of plain
Apache).

The size depends on the code length, the sloppiness of the
programming and possible memory leaks the code might have, all of
that multiplied by the number of children each server spawns. A very
simple example: a server serving an average number of scripts,
demanding 10MB of memory and spawning 10 children, already raises
your memory requirements by 100MB (the real requirement is actually
much smaller if your OS allows code sharing between processes and
programmers exploit these features in their code). Now multiply the
average required size by the number of server users you intend to
have and you will get the total memory requirement.

Since ISPs never say I<No>, you'd better take the inverse approach -
think of the largest memory size you can afford, then divide it by
one user's requirements as I have shown in this example, and you will
know how many mod_perl users you can afford :)

But you cannot tell how much memory your users may use? Their
requirements from a single server can be very modest, but do you know
how many servers they will run? After all, they have full control of
I<httpd.conf> - and it has to be this way, since this is essential
for the user running mod_perl.

All this rambling about memory leads to a single question: is it
possible to prevent users from using more than X memory? Or another
variation of the question: assuming you have as much memory as you
want, can you charge users for their average memory usage?

If the answer to either of the above questions is I<Yes>, you are all
set and your clients will praise your name for letting them run
mod_perl! There are tools to restrict resource usage (see for example
the man pages for C<ulimit(3)>, C<getrlimit(2)>, C<setrlimit(2)> and
C<sysconf(3)>; the last three have corresponding Perl modules:
C<BSD::Resource> and C<Apache::Resource>).
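For example, with the C<BSD::Resource> module a process can cap its
own memory appetite. This is only a minimal sketch: the limit values
are made up, and the availability of C<RLIMIT_AS> varies between
platforms:

  use BSD::Resource qw(setrlimit get_rlimits);

  # see which resource limits this platform knows about
  my $limits = get_rlimits();
  print join(", ", sort keys %$limits), "\n";

  # cap the address space at 64MB soft / 67MB hard (made-up numbers)
  setrlimit($limits->{RLIMIT_AS}, 64*1024**2, 67*1024**2)
      or die "failed to set the resource limits";

C<Apache::Resource> wraps the same calls so that such limits can be
applied to the child processes from the server configuration.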
[ReaderMETA]: If you have experience with other resource limiting
techniques please share it with us. Thank you!

If you have chosen this option, you have to provide your client with:

=over 4

=item *

Shutdown and startup scripts installed together with the rest of your
daemon startup scripts (e.g. in the I</etc/rc.d> directory), so that
when you reboot your machine the user's server will be correctly shut
down and will be back online the moment your system starts up. Also
make sure to start each server under the username the server belongs
to, or you are going to be in big trouble!

=item *

Proxy services (in forward or httpd accelerator mode) for the user's
virtual host. Since the user will have to run their server on an
unprivileged port (E<gt>1024), you will have to forward all requests
from C<user.given.virtual.hostname:80> (which is
C<user.given.virtual.hostname>, since 80 is the default port) to
C<your.machine.ip:port_assigned_to_user>. You will also have to tell
the users to code their scripts so that any self-referencing URLs are
of the form C<user.given.virtual.hostname>.

Letting the user run a mod_perl server immediately adds a requirement
for the user to be able to restart and configure their own server.
Only root can bind to port 80; this is why your users have to use
port numbers greater than 1024.

Another solution would be to use a setuid startup script, but think
twice before you go with it, since if users can modify the scripts
they will get root access. For more information refer to the section
"L<SUID Start-up Scripts|control/SUID_Start_up_Scripts>".

=item *

Another problem you will have to solve is how to assign ports between
users. Since users can pick any port above 1024 to run their server,
you will have to lay down some rules here so that multiple servers do
not conflict.

A simple example will demonstrate the importance of this problem: I
am a malicious user or I am just a rival of some fellow who runs his
server on your ISP. All I need to do is to find out what port my
rival's server is listening to (e.g. using C<netstat(8)>) and
configure my own server to listen on the same port. Although I am
unable to bind to this port, imagine what will happen when you reboot
your system and my startup script happens to be run before my
rival's! I get the port first; now all requests will be redirected to
my server. I'll leave it to your imagination what nasty things might
happen then. Of course the ugly things will quickly be revealed, but
not before the damage has been done.

=back

Basically you can preassign each user a port, without them having to
worry about finding a free one, as well as enforce C<MaxClients> and
similar values, by implementing the following scenario:

For each user have two configuration files, the main file,
I<httpd.conf> (non-writable by the user) and the user's file,
I<username.httpd.conf>, where they can specify their own
configuration parameters and override the ones defined in
I<httpd.conf>. Here is what the main configuration file looks like:

  httpd.conf
  ----------
  # Global/default settings, the user may override some of these
  ...
  ...
  # Included so that user can set his own configuration
  Include username.httpd.conf

  # User-specific settings which will override any potentially
  # dangerous configuration directives in username.httpd.conf
  ...
  ...

  username.httpd.conf
  -------------------
  # Settings that your user would like to add/override,
  # like <Location> and PerlModule directives, etc.

Apache reads the global/default settings first.
Then it reads the I<Include>'d I<username.httpd.conf> file with
whatever settings the user has chosen, and finally it reads the
user-specific settings that we don't want the user to override, such
as the port number. Even if the user changes the port number in his
I<username.httpd.conf> file, Apache reads our settings last, so they
take precedence. Note that you can use L<Perl
sections|config/Apache_Configuration_in_Perl> to make the
configuration much easier.

=item 3

A much better, but costly solution is I<co-location>. Let the user
hook his (or your) stand-alone machine into your network, and forget
about this user. Of course either the user or you will have to
undertake all the system administration chores and it will cost your
client more money.

Who are the people who seek mod_perl support? They are people who run
serious projects/businesses. Money is not usually an obstacle. They
can afford a stand-alone box, thus achieving their goal of autonomy
whilst keeping their ISP happy.

=back

=head2 Virtual Servers Technologies

As we have just seen, one of the obstacles to using mod_perl in an
ISP environment is the problem of isolating customers using the same
machine from each other. A number of I<virtual server> technologies
(not to be confused with virtual hosts), both commercial and Open
Source, exist today. Here are some of them:

=over

=item * The User-mode Linux Kernel

http://user-mode-linux.sourceforge.net/

User-Mode Linux is a safe, secure way of running Linux versions and
Linux processes. Run buggy software, experiment with new Linux
kernels or distributions, and poke around in the internals of Linux,
all without risking your main Linux setup.

User-Mode Linux gives you a virtual machine that may have more
hardware and software virtual resources than your actual, physical
computer. Disk storage for the virtual machine is entirely contained
inside a single file on your physical machine. You can assign your
virtual machine only the hardware access you want it to have. With
properly limited access, nothing you do on the virtual machine can
change or damage your real computer, or its software.

So if you want to completely protect one user from another, and
yourself from your users, this might be yet another alternative to
the solutions suggested at the beginning of this chapter.

=item * VMWare Technology

Allows running a few instances of the same or different OSs on the
same machine. This technology comes in two flavors:

  open source: http://www.freemware.org/ (also known as plex86)
  commercial:  http://www.vmware.com/

So you may want to run a separate OS for each of your clients.

=item * freeVSD Technology

freeVSD (http://www.freevsd.org) is an open source project sponsored
by Idaya Ltd. The software enables ISPs to securely partition their
physical servers into many I<virtual servers>, each capable of
running popular hosting applications such as Apache, Sendmail and
MySQL.

=item * S/390 IBM server

Quoting from: http://www.s390.ibm.com/linux/vif/

"The S/390 Virtual Image Facility enables you to run tens to hundreds
of Linux server images on a single S/390 server. It is ideally suited
for those who want to move Linux and/or UNIX workloads deployed on
multiple servers onto a single S/390 server, while maintaining the
same number of distinct server images. This provides centralized
management and operation of the multiple image environment, reducing
complexity, easing administration and lowering costs."
In short, this is a great solution for huge ISPs, as it allows you to
run hundreds of mod_perl servers while having only one box to
maintain. The drawback is the price :)

Check out this scalable mailing list thread for more details from
those who know:

http://archive.develooper.com/[EMAIL PROTECTED]/msg00235.html

=back

=head1 Virtual Hosts in the guide

If you are about to use I<Virtual Hosts> you might want to read these
sections:

L<Apache Configuration in Perl|config/Apache_Configuration_in_Perl>

L<Easing the Chores of Configuring Virtual Hosts with
mod_macro|config/Configuring_Apache_mod_perl_wi>

L<Is There a Way to Provide a Different startup.pl File for Each
Individual Virtual Host|config/Is_There_a_Way_to_Provide_a_Diff>

L<Is There a Way to Modify @INC on a Per-Virtual-Host or Per-Location
Basis.|config/Is_There_a_Way_to_Modify_INC_on>

L<A Script From One Virtual Host Calls a Script with the Same Path
From the Other Virtual Host|config/A_Script_From_One_Virtual_Host_C>

=cut

1.1                  modperl-docs/src/docs/1.0/guide/performance.pod

Index: performance.pod
===================================================================
=head1 NAME

Performance Tuning

=head1 The Big Picture

To make the user's Web browsing experience as painless as possible,
every effort must be made to wring the last drop of performance from
the server. There are many factors which affect Web site usability,
but speed is one of the most important. This applies to any
webserver, not just Apache, so it is very important that you
understand it.

How do we measure the speed of a server? Since the user (and not the
computer) is the one that interacts with the Web site, one good speed
measurement is the time elapsed between the moment when she clicks on
a link or presses a I<Submit> button to the moment when the resulting
page is fully rendered.

The requests and replies are broken into packets. A request may be
made up of several packets, a reply may be many thousands. Each
packet has to make its own way from one machine to another, perhaps
passing through many interconnection nodes. We must measure the time
starting from when the first packet of the request leaves our user's
machine to when the last packet of the reply arrives back there.

A webserver is only one of the entities the packets see along their
way. If we follow them from browser to server and back again, they
may travel by different routes through many different entities.
Before they are processed by your server the packets might have to go
through proxy (accelerator) servers, and if the request contains more
than one packet, packets might arrive at the server by different
routes and with different arrival times. It's therefore possible that
some packets that arrive earlier will have to wait for other packets
before they can be reassembled into a chunk of the request message
that can then be read by the server. Then the whole process is
repeated in reverse.

You could work hard to fine-tune your webserver's performance, but a
slow Network Interface Card (NIC) or a slow network connection from
your server might defeat it all. That's why it's important to think
about the Big Picture and to be aware of possible bottlenecks between
the server and the Web.

Of course there is little that you can do if the user has a slow
connection.
You might tune your scripts and webserver to process incoming
requests ultra quickly, so you will need only a small number of
working servers, but you might find that the server processes are all
busy waiting for slow clients to accept their responses. But there
are techniques to cope with this. For example you can deliver the
response compressed. If you are delivering a pure text response, gzip
compression will sometimes reduce its size by a factor of 10.

You should analyze all the involved components when you try to create
the best service for your users, not just the web server or the code
that the web server executes. A Web service is like a car: if one of
the parts or mechanisms is broken, the car may not run smoothly, and
it can even stop dead if pushed too far without fixing it first.

And let me stress it again--if you want to succeed in the web service
business you should start worrying about the client's browsing
experience and B<not only> about how good your code benchmarks are.

=head1 System Analysis

Before we try to solve a problem we need to identify it. In our case
we want to get the best performance we can with as little monetary
and time investment as possible.

=head2 Software Requirements

Covered in the section "L<Choosing an Operating
System|hardware/Choosing_an_Operating_System>".

=head2 Hardware Requirements

(META: Only partial analysis. Please submit more points. Many points
are scattered around the document and should be gathered here, to
represent the whole picture. It also should be merged with the above
item!)

You need to analyze all of the problem's dimensions. There are
several things that need to be considered:

=over

=item *

How long does it take to process each request?

=item *

How many requests can you process simultaneously?

=item *

How many simultaneous requests are you planning to get?

=item *

At what rate are you expecting to receive requests?

=back

The first one is probably the easiest to optimize. Following the
performance optimization tips in this and other documents allows a
Perl (mod_perl) programmer to exercise their code and improve it.

The second one is a function of RAM. How much RAM is in each box, how
many boxes do you have, and how much RAM does each mod_perl process
use? Multiply the first two and divide by the third. Ask yourself
whether it is better to switch to another, possibly just as
inefficient language, or whether that will actually cost more than
throwing another powerful machine into the rack. Also ask yourself
whether switching to another language will even help. In some
applications, for example those that link in Oracle runtime
libraries, a huge chunk of memory is needed, so you would save
nothing even if you switched from Perl to C.

The last two are important. You need a realistic estimate. Are you
really expecting 8 million hits per day? What is the expected peak
load, and what kind of response time do you need to guarantee?
Remember that these numbers might change drastically when you apply
code changes and your site becomes popular. Remember that when you
get a very high hit rate, the resource requirements don't grow
linearly, but exponentially!

More coverage is provided in the section "L<Choosing
Hardware|hardware/Choosing_Hardware>".

=head1 Essential Tools

In order to improve performance we need measurement tools. The main
tool categories are benchmarking and code profiling.
It's important to understand that in most of the benchmarking tests
that we will execute, we will not look at the absolute result numbers
but at the relationship between two or more result sets, since in
most cases we are trying to show which coding approach is preferable.
You shouldn't try to compare the absolute results collected while
running the same benchmarks on your machine with those shown here,
since you won't have exactly the same hardware and software setup
anyway, so this kind of comparison would be misleading. Compare the
relative results from the tests running on your machine; don't
compare your absolute results with those in this Guide.

=head2 Benchmarking Applications

How much faster is mod_perl than mod_cgi (aka plain perl/CGI)? There
are many ways to benchmark the two. I'll present a few examples and
numbers below. Check out the C<benchmark> directory of the mod_perl
distribution for more examples.

If you are going to write your own benchmarking utility, use the
C<Benchmark> module for heavy scripts and the C<Time::HiRes> module
for very fast scripts (faster than 1 sec) where you will need better
time precision.

There is no need to write a special benchmark though. If you want to
impress your boss or colleagues, just take some heavy CGI script you
have (e.g. a script that crunches some data and prints the results to
STDOUT), open 2 xterms and call the same script in mod_perl mode in
one xterm and in mod_cgi mode in the other. You can use C<lwp-get>
from the C<LWP> package to emulate the browser. The C<benchmark>
directory of the mod_perl distribution includes such an example.

See also two tools for benchmarking:
L<ApacheBench|performance/Configuration_Tuning_with_ApacheBench> and
the L<crashme test|performance/the_crashme_Script>

=head3 Benchmarking Perl Code

If you are going to write your own benchmarking utility, use the
C<Benchmark> module and the C<Time::HiRes> module where you need
better time precision (E<lt>10msec).

An example of the C<Benchmark.pm> module usage:

  benchmark.pl
  ------------
  use Benchmark;
  timethis (1_000, sub {
      my $x = 100;
      my $y = log ($x ** 100) for (0..10000);
  });

  % perl benchmark.pl
  timethis 1000: 25 wallclock secs (24.93 usr +  0.00 sys = 24.93 CPU)

If you want to get the benchmark results in micro-seconds you will
have to use the C<Time::HiRes> module; its usage is similar to
C<Benchmark>'s.

  use Time::HiRes qw(gettimeofday tv_interval);
  my $start_time = [ gettimeofday ];
  sub_that_takes_a_teeny_bit_of_time();
  my $end_time = [ gettimeofday ];
  my $elapsed = tv_interval($start_time,$end_time);
  print "The sub took $elapsed seconds.\n";

See also the L<crashme test|performance/the_crashme_Script>.

=head3 Benchmarking a Graphic Hits Counter with Persistent DB Connections

Here are the numbers from Michael Parker's mod_perl presentation at
the Perl Conference (Aug, 98). (Sorry, there used to be links here to
the source, but they went dead one day, so I removed them). The
script is a standard hits counter, but it logs the counts into a
MySQL relational database:

  Benchmark: timing 100 iterations of cgi, perl...
  [rate 1:28]
       cgi: 56 secs ( 0.33 usr  0.28 sys =  0.61 cpu)
      perl:  2 secs ( 0.31 usr  0.27 sys =  0.58 cpu)

  Benchmark: timing 1000 iterations of cgi,perl...
  [rate 1:21]
       cgi: 567 secs ( 3.27 usr  2.83 sys =  6.10 cpu)
      perl:  26 secs ( 3.11 usr  2.53 sys =  5.64 cpu)

  Benchmark: timing 10000 iterations of cgi, perl
  [rate 1:21]
       cgi: 6494 secs (34.87 usr 26.68 sys = 61.55 cpu)
      perl:  299 secs (32.51 usr 23.98 sys = 56.49 cpu)

We don't know what server configurations were used for these tests,
but I guess the numbers speak for themselves.

The source code of the script was available at
http://www.realtime.net/~parkerm/perl/conf98/sld006.htm. It's now a
dead link. If you know its new location, please let me know.

=head3 Benchmarking Response Times

In the next sections we will talk about tools that allow us to
benchmark response times.

=head4 ApacheBench

ApacheBench (B<ab>) is a tool for benchmarking your Apache HTTP
server. It is designed to give you an idea of the performance that
your current Apache installation can give. In particular, it shows
you how many requests per second your Apache server is capable of
serving. The B<ab> tool comes bundled with the Apache source
distribution.

Let's try it. We will simulate 10 users concurrently requesting a
very light script at C<www.example.com/perl/test.pl>. Each simulated
user makes 10 requests.

  % ./ab -n 100 -c 10 www.example.com/perl/test.pl

The results are:

  Document Path:          /perl/test.pl
  Document Length:        319 bytes

  Concurrency Level:      10
  Time taken for tests:   0.715 seconds
  Complete requests:      100
  Failed requests:        0
  Total transferred:      60700 bytes
  HTML transferred:       31900 bytes
  Requests per second:    139.86
  Transfer rate:          84.90 kb/s received

  Connection Times (ms)
                min   avg   max
  Connect:        0     0     3
  Processing:    13    67    71
  Total:         13    67    74

We can see that under a load of ten concurrent users our server is
capable of processing 140 requests per second. Of course this
benchmark is valid only for the particular script under test. We can
also learn about the average processing time, which in this case was
67 milliseconds. Other numbers reported by C<ab> may or may not be of
interest to you.

For example, if we believe that the script I<perl/test.pl> is not
efficient we will try to improve it and run the benchmark again, to
see whether we have any improvement in performance.

=head4 httperf

httperf is a utility written by David Mosberger. Just like
ApacheBench, it measures the performance of the webserver.

A sample command line is shown below:

  httperf --server hostname --port 80 --uri /test.html \
   --rate 150 --num-conn 27000 --num-call 1 --timeout 5

This command causes httperf to use the web server on the host
I<hostname>, running at port 80. The web page being retrieved is
I</test.html> and, in this simple test, the same page is retrieved
repeatedly. The rate at which requests are issued is 150 per second.
The test involves initiating a total of 27,000 TCP connections and on
each connection one HTTP call is performed. A call consists of
sending a request and receiving a reply.

The timeout option defines the number of seconds that the client is
willing to wait to hear back from the server. If this timeout
expires, the tool considers the corresponding call to have failed.
Note that with a total of 27,000 connections and a rate of 150 per
second, the total test duration will be approximately 180 seconds
(27,000/150), independently of what load the server can actually
sustain.
Here is a result that one might get:

  Total: connections 27000 requests 26701 replies 26701 test-duration 179.996 s

  Connection rate: 150.0 conn/s (6.7 ms/conn, <=47 concurrent connections)
  Connection time [ms]: min 1.1 avg 5.0 max 315.0 median 2.5 stddev 13.0
  Connection time [ms]: connect 0.3

  Request rate: 148.3 req/s (6.7 ms/req)
  Request size [B]: 72.0

  Reply rate [replies/s]: min 139.8 avg 148.3 max 150.3 stddev 2.7 (36 samples)
  Reply time [ms]: response 4.6 transfer 0.0
  Reply size [B]: header 222.0 content 1024.0 footer 0.0 (total 1246.0)
  Reply status: 1xx=0 2xx=26701 3xx=0 4xx=0 5xx=0

  CPU time [s]: user 55.31 system 124.41 (user 30.7% system 69.1% total 99.8%)
  Net I/O: 190.9 KB/s (1.6*10^6 bps)

  Errors: total 299 client-timo 299 socket-timo 0 connrefused 0 connreset 0
  Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0

L<httperf download|download/httperf____webserver_Benchmarking_tool>

=head4 http_load

C<http_load> is yet another utility that does webserver load testing.
It can simulate a 33.6 kbps modem connection (I<-throttle>) and
allows you to provide a file with a list of URLs, which will be
fetched randomly. You can specify how many parallel connections to
run using the I<-parallel N> option, or you can specify the number of
requests to generate per second with the I<-rate N> option. Finally
you can tell the utility when to stop by specifying either the test
time length (I<-seconds N>) or the total number of fetches
(I<-fetches N>).

A sample run with the file I<urls> containing:

  http://www.example.com/foo/
  http://www.example.com/bar/

We ask to generate three requests per second and run for only two
seconds. Here is the generated output:

  % ./http_load -rate 3 -seconds 2 urls
  http://www.example.com/foo/: check-connect SUCCEEDED, ignoring
  http://www.example.com/bar/: check-connect SUCCEEDED, ignoring
  http://www.example.com/bar/: check-connect SUCCEEDED, ignoring
  http://www.example.com/bar/: check-connect SUCCEEDED, ignoring
  http://www.example.com/foo/: check-connect SUCCEEDED, ignoring
  5 fetches, 3 max parallel, 96870 bytes, in 2.00258 seconds
  19374 mean bytes/connection
  2.49678 fetches/sec, 48372.7 bytes/sec
  msecs/connect: 1.805 mean, 5.24 max, 0.79 min
  msecs/first-response: 291.289 mean, 560.338 max, 34.349 min

So you can see that it has reported 2.5 requests per second. Of
course for a real test you will want to load the server heavily and
run the test for a longer time to get more reliable results.

Note that when you provide a file with a list of URLs, make sure that
you don't have empty lines in it. If you do, the utility won't work,
complaining:

  ./http_load: unknown protocol -

L<http_load download|download/http_load____another_webserver_Benchmarking_tool>

=head4 the crashme Script

This is another crashme suite originally written by Michael Schilli
(it was located at the http://www.linux-magazin.de site, but the link
has now gone). I made a few modifications, mostly adding my()
operators. I also allowed it to accept more than one URL to test,
since sometimes you want to test more than one script.

The tool provides the same results as B<ab> above, but it also allows
you to set the timeout value, so requests will fail if not served
within the timeout period. You also get values for B<Latency>
(seconds per request) and B<Throughput> (requests per second). It can
do a complete simulation of your favorite Netscape browser :) and
give you a better picture.

I have noticed while running these two benchmarking suites that B<ab>
gave me results from two and a half to three times better.
Both suites were run on the same machine, with the same load and the
same parameters, but the implementations were different.

Sample output:

  URL(s):          http://www.example.com/perl/access/access.cgi
  Total Requests:  100
  Parallel Agents: 10
  Succeeded:       100 (100.00%)
  Errors:          NONE
  Total Time:      9.39 secs
  Throughput:      10.65 Requests/sec
  Latency:         0.85 secs/Request

And the code:

=code lwp-bench.pl The LWP::Parallel::UserAgent benchmark

=head3 Benchmarking PerlHandlers

The C<Apache::Timeit> module does C<PerlHandler> benchmarking. With
the help of this module you can log the time taken to process the
request, just like you'd use the C<Benchmark> module to benchmark a
regular Perl script. Of course you can extend this module to perform
more advanced processing, like putting the results into a database
for later processing. But all it takes is adding this configuration
directive inside I<httpd.conf>:

  PerlFixupHandler Apache::Timeit

Since scripts running under C<Apache::Registry> are running inside a
PerlHandler, these are benchmarked as well.

An example of the lines which show up in the I<error_log> file:

  timing request for /perl/setupenvoff.pl:
    0 wallclock secs ( 0.04 usr + 0.01 sys = 0.05 CPU)
  timing request for /perl/setupenvoff.pl:
    0 wallclock secs ( 0.03 usr + 0.00 sys = 0.03 CPU)

The C<Apache::Timeit> package is a part of the I<Apache-Perl-contrib>
files collection available from CPAN.

=head2 Code Profiling Techniques

The profiling process helps you to determine which subroutines, or
just snippets of code, take the longest time to execute and which
subroutines are called most often. You will probably want to optimize
those.

When do you need to profile your code? You do that when you suspect
that some part of your code is called very often and maybe there is a
need to optimize it to significantly improve the overall performance.

For example, you may have used the C<diagnostics> pragma, which
extends the terse diagnostics normally emitted by both the Perl
compiler and the Perl interpreter, augmenting them with the more
verbose and endearing descriptions found in the C<perldiag> manpage.
You know that it might tremendously slow your code down, so let's
first prove that this is indeed the case.

We will run a benchmark, once with diagnostics enabled and once
disabled, on a subroutine called I<test_code>. The code inside the
subroutine does a numeric comparison of two strings. It assigns one
string to another if the condition tests true, but the condition
always tests false. To demonstrate the C<diagnostics> overhead the
comparison operator is intentionally I<wrong>. It should be a string
comparison, not a numeric one.

  use Benchmark;
  use diagnostics;
  use strict;

  my $count = 50000;

  disable diagnostics;
  my $t1 = timeit($count,\&test_code);

  enable  diagnostics;
  my $t2 = timeit($count,\&test_code);

  print "Off: ",timestr($t1),"\n";
  print "On : ",timestr($t2),"\n";

  sub test_code {
      my ($a,$b) = qw(foo bar);
      my $c;
      if ($a == $b) {
          $c = $a;
      }
  }

For only a few lines of code we get:

  Off:  1 wallclock secs ( 0.81 usr +  0.00 sys =  0.81 CPU)
  On : 13 wallclock secs (12.54 usr +  0.01 sys = 12.55 CPU)

With C<diagnostics> enabled, the subroutine test_code() is 16 times
slower than with C<diagnostics> disabled!
Now let's fix the comparison the way it should be, by replacing C<==>
with C<eq>, so we get:

  my ($a,$b) = qw(foo bar);
  my $c;
  if ($a eq $b) {
      $c = $a;
  }

and run the same benchmark again:

  Off:  1 wallclock secs ( 0.57 usr +  0.00 sys =  0.57 CPU)
  On :  1 wallclock secs ( 0.56 usr +  0.00 sys =  0.56 CPU)

Now there is no overhead at all. The C<diagnostics> pragma slows
things down only when warnings are generated.

After we have verified that using the C<diagnostics> pragma might add
a big overhead to execution runtime, let's use code profiling to
understand why this happens. We are going to use C<Devel::DProf> to
profile the code. Let's use this code:

  diagnostics.pl
  --------------
  use diagnostics;
  print "Content-type:text/html\n\n";
  test_code();
  sub test_code {
      my ($a,$b) = qw(foo bar);
      my $c;
      if ($a == $b) {
          $c = $a;
      }
  }

Run it with the profiler enabled, and then create the profiling
statistics with the help of dprofpp:

  % perl -d:DProf diagnostics.pl
  % dprofpp

  Total Elapsed Time = 0.342236 Seconds
    User+System Time = 0.335420 Seconds
  Exclusive Times
  %Time ExclSec CumulS #Calls sec/call Csec/c  Name
   92.1   0.309  0.358      1   0.3089 0.3578  main::BEGIN
   14.9   0.050  0.039   3161   0.0000 0.0000  diagnostics::unescape
   2.98   0.010  0.010      2   0.0050 0.0050  diagnostics::BEGIN
   0.00   0.000 -0.000      2   0.0000      -  Exporter::import
   0.00   0.000 -0.000      2   0.0000      -  Exporter::export
   0.00   0.000 -0.000      1   0.0000      -  Config::BEGIN
   0.00   0.000 -0.000      1   0.0000      -  Config::TIEHASH
   0.00   0.000 -0.000      2   0.0000      -  Config::FETCH
   0.00   0.000 -0.000      1   0.0000      -  diagnostics::import
   0.00   0.000 -0.000      1   0.0000      -  main::test_code
   0.00   0.000 -0.000      2   0.0000      -  diagnostics::warn_trap
   0.00   0.000 -0.000      2   0.0000      -  diagnostics::splainthis
   0.00   0.000 -0.000      2   0.0000      -  diagnostics::transmo
   0.00   0.000 -0.000      2   0.0000      -  diagnostics::shorten
   0.00   0.000 -0.000      2   0.0000      -  diagnostics::autodescribe

It's not easy to see what is responsible for this enormous overhead,
even if C<main::BEGIN> seems to be running most of the time. To get
the full picture we must see the OPs tree, which shows us who calls
whom, so we run:

  % dprofpp -T

and the output is:

  main::BEGIN
    diagnostics::BEGIN
      Exporter::import
        Exporter::export
    diagnostics::BEGIN
      Config::BEGIN
      Config::TIEHASH
      Exporter::import
        Exporter::export
    Config::FETCH
    Config::FETCH
    diagnostics::unescape
    .....................
    3159 times [diagnostics::unescape] snipped
    .....................
    diagnostics::unescape
    diagnostics::import
  diagnostics::warn_trap
    diagnostics::splainthis
      diagnostics::transmo
      diagnostics::shorten
      diagnostics::autodescribe
  main::test_code
    diagnostics::warn_trap
      diagnostics::splainthis
        diagnostics::transmo
        diagnostics::shorten
        diagnostics::autodescribe
    diagnostics::warn_trap
      diagnostics::splainthis
        diagnostics::transmo
        diagnostics::shorten
        diagnostics::autodescribe

So we see that two executions of C<diagnostics::BEGIN> and 3161 of
C<diagnostics::unescape> are responsible for most of the running
overhead.

If we comment out the C<diagnostics> module, we get:

  Total Elapsed Time = 0.079974 Seconds
    User+System Time = 0.059974 Seconds
  Exclusive Times
  %Time ExclSec CumulS #Calls sec/call Csec/c  Name
   0.00   0.000 -0.000      1   0.0000      -  main::test_code

It is possible to profile code running under mod_perl with the
C<Devel::DProf> module, available on CPAN. However, you must have
Apache version 1.3b3 or higher and the C<PerlChildExitHandler>
enabled during the httpd build process. When the server is started,
C<Devel::DProf> installs an C<END> block to write the I<tmon.out>
file. This block will be called at server shutdown.
Here is how to start and stop a server with the profiler enabled:

  % setenv PERL5OPT -d:DProf
  % httpd -X -d `pwd` &
  ... make some requests to the server here ...
  % kill `cat logs/httpd.pid`
  % unsetenv PERL5OPT
  % dprofpp

The C<Devel::DProf> package is a Perl code profiler. It will collect
information on the execution time of a Perl script and of the subs in
that script (remember that C<print()> and C<map()> are just like any
other subroutines you write, but they come bundled with Perl!)

Another approach is to use C<Apache::DProf>, which hooks
C<Devel::DProf> into mod_perl. The C<Apache::DProf> module will run a
C<Devel::DProf> profiler inside each child server and write the
I<tmon.out> file in the directory C<$ServerRoot/logs/dprof/$$> when
the child is shut down (where C<$$> is the process number of the
child). All it takes is to add to I<httpd.conf>:

  PerlModule Apache::DProf

Remember that any PerlHandler that was pulled in before
C<Apache::DProf> in the I<httpd.conf> or I<startup.pl> will not have
its code debugging information inserted. To run C<dprofpp>, chdir to
C<$ServerRoot/logs/dprof/$$> and run:

  % dprofpp

(Look up the C<ServerRoot> directive's value in I<httpd.conf> to
figure out what your C<$ServerRoot> is.)

=head2 Measuring the Memory of the Process

A very important aspect of performance tuning is to make sure that
your applications don't use too much memory. If they do, you cannot
run many servers, and therefore in most cases, under a heavy load,
the overall performance degrades.

In addition, the code may not be clean and may leak memory, which is
even worse: if the same process serves many requests, and more memory
is used after each request, after a while all the RAM will be used
and the machine will start swapping (using the swap partition), which
is a very undesirable event, since it may lead to a machine crash.

The simplest way to figure out how big the processes are, and to see
whether they grow, is to watch the output of the top(1) or ps(1)
utilities.

For example the output of top(1):

    8:51am  up 66 days,  1:44,  1 user,  load average: 1.09, 2.27, 2.61
  95 processes: 92 sleeping, 3 running, 0 zombie, 0 stopped
  CPU states: 54.0% user,  9.4% system,  1.7% nice, 34.7% idle
  Mem:  387664K av, 309692K used,  77972K free, 111092K shrd,  70944K buff
  Swap: 128484K av,  11176K used, 117308K free                170824K cached

    PID USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
  29225 nobody     0   0  9760 9760  7132 S       0 12.5  2.5   0:00 httpd_perl
  29220 nobody     0   0  9540 9540  7136 S       0  9.0  2.4   0:00 httpd_perl
  29215 nobody     1   0  9672 9672  6884 S       0  4.6  2.4   0:01 httpd_perl
  29255 root       7   0  1036 1036   824 R       0  3.2  0.2   0:01 top
    376 squid      0   0 15920  14M   556 S       0  1.1  3.8 209:12 squid
  29227 mysql      5   5  1892 1892   956 S N     0  1.1  0.4   0:00 mysqld
  29223 mysql      5   5  1892 1892   956 S N     0  0.9  0.4   0:00 mysqld
  29234 mysql      5   5  1892 1892   956 S N     0  0.9  0.4   0:00 mysqld

The output starts with overall system information and then displays
the most active processes at the given moment. So, for example, if we
look at the C<httpd_perl> processes we can see the size of the
resident (C<RSS>) and shared (C<SHARE>) memory segments. This sample
was taken on a production server running Linux.

But of course we want to see all the apache/mod_perl processes, and
that's where ps(1) comes to help. The options of this utility vary
from one Unix flavor to another, and some flavors provide their own
tools.
Let's check the information about mod_perl processes:

  % ps -o pid,user,rss,vsize,%cpu,%mem,ucomm -C httpd_perl
    PID USER      RSS   VSZ %CPU %MEM COMMAND
  29213 root     8584 10264  0.0  2.2 httpd_perl
  29215 nobody   9740 11316  1.0  2.5 httpd_perl
  29216 nobody   9668 11252  0.7  2.4 httpd_perl
  29217 nobody   9824 11408  0.6  2.5 httpd_perl
  29218 nobody   9712 11292  0.6  2.5 httpd_perl
  29219 nobody   8860 10528  0.0  2.2 httpd_perl
  29220 nobody   9616 11200  0.5  2.4 httpd_perl
  29221 nobody   8860 10528  0.0  2.2 httpd_perl
  29222 nobody   8860 10528  0.0  2.2 httpd_perl
  29224 nobody   8860 10528  0.0  2.2 httpd_perl
  29225 nobody   9760 11340  0.7  2.5 httpd_perl
  29235 nobody   9524 11104  0.4  2.4 httpd_perl

Now you can see the resident (C<RSS>) and virtual (C<VSZ>) memory
segments (and the shared memory segment if you ask for it) of all
mod_perl processes. Please refer to the top(1) and ps(1) man pages
for more information.

You probably agree that using top(1) and ps(1) is cumbersome if we
want to use memory size sampling during the benchmark test. We want
to have a way to print memory sizes during the program execution at
desired places. If you have the C<GTop> module installed, which is a
Perl glue to the C<libgtop> library, that's exactly what we need.

Note: C<GTop> requires the C<libgtop> library, which is not available
for all platforms. Visit http://www.home-of-linux.org/gnome/libgtop/
to check whether your platform/flavor is supported.

C<GTop> provides an API for retrieval of information about processes
and the whole system. We are interested only in the memory sampling
API methods. To print all the process related memory information we
can execute the following code:

  use GTop;
  my $gtop = GTop->new;
  my $proc_mem = $gtop->proc_mem($$);
  for (qw(size vsize share rss)) {
      printf "   %s => %d\n", $_, $proc_mem->$_();
  }

When executed we see the following output (in bytes):

     size => 1900544
    vsize => 3108864
    share => 1392640
      rss => 1900544

So if we want to print the process's resident memory segment before
and after some event, we can just do it. For example, if we want to
see how much extra memory was allocated after a variable was created,
we can write the following code:

  use GTop;
  my $gtop = GTop->new;
  my $before = $gtop->proc_mem($$)->rss;
  my $x = 'a' x 10000;
  my $after = $gtop->proc_mem($$)->rss;
  print "diff: ",$after-$before, " bytes\n";

and the output:

  diff: 20480 bytes

So we can see that Perl has allocated an extra 20480 bytes to create
C<$x> (of course the creation of C<$after> needed a few bytes as
well, but it's insignificant compared to the size of C<$x>).

The C<Apache::VMonitor> module, with the help of the C<GTop> module,
allows you to watch all your system information using your favorite
browser, from anywhere in the world, without the need to telnet to
your machine. If you want to know what information you can retrieve
with C<GTop>, look at C<Apache::VMonitor>, as it deploys a big part
of the API C<GTop> provides.

If you are running a true BSD system, you may use
C<BSD::Resource::getrusage> instead of C<GTop>. For example:

  print "used memory = ".(BSD::Resource::getrusage)[2]."\n";

For more information refer to the C<BSD::Resource> manpage.

=head2 Measuring the Memory Usage of Subroutines

With the help of C<Apache::Status> you can find out the size of each
and every subroutine.

=over

=item 1

Build and install mod_perl as you always do, making sure it's version
1.22 or higher.

=item 2

Configure /perl-status if you haven't already:

  <Location /perl-status>
    SetHandler perl-script
    PerlHandler Apache::Status
    order deny,allow
    #deny  from all
    #allow from ...
  </Location>

=item 3

Add to httpd.conf:

  PerlSetVar StatusOptionsAll On
  PerlSetVar StatusTerse On
  PerlSetVar StatusTerseSize On
  PerlSetVar StatusTerseSizeMainSummary On

  PerlModule B::TerseSize

=item 4

Start the server (best in httpd -X mode)

=item 5

From your favorite browser fetch http://localhost/perl-status

=item 6

Click on 'Loaded Modules' or 'Compiled Registry Scripts'

=item 7

Click on the module or script of your choice (you might need to run
some script/handler before you will see it here unless it was
preloaded)

=item 8

Click on 'Memory Usage' at the bottom

=item 9

You should see all the subroutines and their respective sizes.

=back

Now you can start to optimize your code, or test which of several
implementations uses the least memory.

For example let's compare C<CGI.pm>'s OO vs. procedural interfaces:

As you will see below, the first (OO) script uses about 2KB while the
second script (procedural interface) uses about 5KB.

Here are the code examples and the numbers:

=over

=item 1

  cgi_oo.pl
  ---------
  use CGI ();
  my $q = CGI->new;
  print $q->header;
  print $q->b("Hello");

=item 2

  cgi_mtd.pl
  ----------
  use CGI qw(header b);
  print header();
  print b("Hello");

=back

After executing each script in single server mode (-X) the results
are:

=over

=item 1

  Totals: 1966 bytes | 27 OPs

  handler 1514 bytes | 27 OPs
  exit     116 bytes |  0 OPs

=item 2

  Totals: 4710 bytes | 19 OPs

  handler  1117 bytes | 19 OPs
  basefont  120 bytes |  0 OPs
  frameset  120 bytes |  0 OPs
  caption   119 bytes |  0 OPs
  applet    118 bytes |  0 OPs
  script    118 bytes |  0 OPs
  ilayer    118 bytes |  0 OPs
  header    118 bytes |  0 OPs
  strike    118 bytes |  0 OPs
  layer     117 bytes |  0 OPs
  table     117 bytes |  0 OPs
  frame     117 bytes |  0 OPs
  style     117 bytes |  0 OPs
  Param     117 bytes |  0 OPs
  small     117 bytes |  0 OPs
  embed     117 bytes |  0 OPs
  font      116 bytes |  0 OPs
  span      116 bytes |  0 OPs
  exit      116 bytes |  0 OPs
  big       115 bytes |  0 OPs
  div       115 bytes |  0 OPs
  sup       115 bytes |  0 OPs
  Sub       115 bytes |  0 OPs
  TR        114 bytes |  0 OPs
  td        114 bytes |  0 OPs
  Tr        114 bytes |  0 OPs
  th        114 bytes |  0 OPs
  b         113 bytes |  0 OPs

=back

Note that the above is correct if you didn't precompile all of
C<CGI.pm>'s methods at server startup. If you did, the procedural
interface in the second test will take up about 18KB, not 5KB as we
saw. That's because the whole of C<CGI.pm>'s namespace is inherited
and it already has all its methods compiled, so it doesn't really
matter whether you attempt to import only the symbols that you need.
So if you have:

  use CGI qw(-compile :all);

in the server startup script, having:

  use CGI qw(header);

or

  use CGI qw(:all);

is essentially the same. You will have all the symbols precompiled at
startup imported, even if you ask for only one symbol. It seems to me
like a bug, but probably that's how C<CGI.pm> works.

BTW, you can check the number of opcodes in the code by a simple
command line run. For example comparing S<'my %hash'> vs. S<'my %hash
= ()'>:

  % perl -MO=Terse -e 'my %hash' | wc -l
  -e syntax OK
  4

  % perl -MO=Terse -e 'my %hash = ()' | wc -l
  -e syntax OK
  10

The first one has fewer opcodes.

Note that you shouldn't use the C<Apache::Status> module on a
production server, as it adds quite a bit of overhead for each
request.

=head1 Know Your Operating System

In order to get the best performance it helps to get intimately
familiar with the Operating System (OS) the web server is running on.
There are many OS-specific things that you may be able to optimize
which will improve your web server's speed, reliability and security.
The following sections will reveal some of the most important details you should know about your OS.

=head2 Sharing Memory

The sharing of memory is one very important factor. If your OS supports it (and most sane systems do), you might save memory by sharing it between child processes. This is only possible when you preload code at server startup. However, during a child process' life its memory pages tend to become unshared. There is no way we can make Perl allocate memory so that (dynamic) variables land on different memory pages from constants, so the B<copy-on-write> effect (we will explain this in a moment) will hit you almost at random.

If you are pre-loading many modules you might be able to trade off the memory that stays shared against the time for an occasional fork by tuning C<MaxRequestsPerChild>. Each time a child reaches this upper limit and dies, it should release its unshared pages. The new child which replaces it will share its fresh pages until it scribbles on them.

The ideal is a point where your processes usually restart before too much memory becomes unshared. You should take some measurements to see if it makes a real difference, and to find the range of reasonable values. If you have success with this tuning, the value of C<MaxRequestsPerChild> will probably be peculiar to your situation and may change with changing circumstances.

It is very important to understand that your goal is not to have C<MaxRequestsPerChild> be 10000. Having a child serve 300 requests on precompiled code is already a huge overall speedup, so whether it is 100 or 10000 probably does not really matter, if you can save RAM by using a lower value.

Do not forget that if you preload most of your code at server startup, the newly forked child gets ready very fast, because it inherits most of the preloaded code and the perl interpreter from the parent process.

During the life of the child, its memory pages (which aren't really its own to start with, it uses the parent's pages) gradually get `dirty' - variables which were originally inherited and shared are updated or modified - and the I<copy-on-write> happens. This reduces the number of shared memory pages, thus increasing the memory requirement. Killing the child and spawning a new one allows the new child to get back to the pristine shared memory of the parent process.

The recommendation is that C<MaxRequestsPerChild> should not be too large, otherwise you lose some of the benefit of sharing memory. See L<Choosing MaxRequestsPerChild|performance/Choosing_MaxRequestsPerChild> for more about tuning the C<MaxRequestsPerChild> parameter.

=head3 How Shared Is My Memory?

You've probably noticed that the word shared is repeated many times in relation to mod_perl. Indeed, shared memory might save you a lot of money, since with sharing in place you can run many more servers than without it. See L<the Formula and the numbers|performance/Choosing_MaxClients>.

How much shared memory do you have? You can see it by either using the memory utility that comes with your system or by deploying the C<GTop> module:

  use GTop ();
  print "Shared memory of the current process: ",
      GTop->new->proc_mem($$)->share,"\n";

  print "Total shared memory: ",
      GTop->new->mem->share,"\n";

When you watch the output of the C<top> utility, don't confuse the C<RES> (or C<RSS>) columns with the C<SHARE> column. C<RES> is RESident memory, which is the size of pages currently swapped in.
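To get a feeling for how well the sharing holds up across all the children, and not just the current process, you can sample each child's memory from one place. Here is a minimal sketch, assuming you collect the children's PIDs yourself (e.g. from the ps(1) output shown earlier); the C<@pids> list below is hypothetical:

  use GTop ();
  my $gtop = GTop->new;
  # hypothetical list of httpd child PIDs, e.g. taken from ps(1)
  my @pids = (29215, 29216, 29217);
  printf "%8s %10s %10s %10s\n", qw(PID Size Shared Unshared);
  for my $pid (@pids) {
      my $mem = $gtop->proc_mem($pid);
      printf "%8d %10d %10d %10d\n",
             $pid, $mem->size, $mem->share, $mem->size - $mem->share;
  }

Note that you need sufficient permissions to read the other processes' information.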
=head3 Calculating Real Memory Usage

I have shown how to measure the size of the process' shared memory, but we still want to know what the real memory usage is. Obviously this cannot be calculated simply by adding up the memory size of each process, because that wouldn't account for the shared memory. On the other hand we cannot just subtract the shared memory size from the total size to get the real memory usage numbers, because in reality each process has a different history of processed requests, and therefore the shared memory is not of the same size for all processes.

So how do we measure the real memory size used by the server we run? It's probably too difficult to give the exact number, but I've found a way to get a fair approximation, which was verified in the following way: I calculated the real memory used with the technique you will see in a moment, then stopped the Apache server and saw that the memory usage report indicated that the total used memory went down by almost the same number I had calculated. Note that some OSs do smart memory page caching, so you may not see the memory usage decrease as soon as it actually happens when you quit the application.

This is the technique I've used:

=over

=item 1

For each process sum up the difference between shared and total memory. To calculate the difference for a single process use:

  use GTop;
  my $proc_mem = GTop->new->proc_mem($$);
  my $diff     = $proc_mem->size - $proc_mem->share;
  print "Difference is $diff bytes\n";

=item 2

Now add the shared memory size of the process with the maximum shared memory. This gives all the memory that is actually being used by all the httpd processes, except for the parent process.

=item 3

Finally, add the size of the parent process.

=back

Please note that this might be incorrect for your system, so use this number at your own risk.

I've used this technique to display real memory usage in the module L<Apache::VMonitor|debug/Apache__VMonitor____Visual_System_and_Apache_Server_Monitor>, so instead of trying to calculate this number manually you can use this module to do it automatically. In fact in the calculations used in this module there is no separation between the parent and child processes; they are all counted indifferently using the following code:

  use GTop ();
  my $gtop = GTop->new;
  my $total_real = 0;
  my $max_shared = 0;
  # @mod_perl_pids is initialized by Apache::Scoreboard, irrelevant here
  my @mod_perl_pids = some_code();
  for my $pid (@mod_perl_pids) {
      my $proc_mem = $gtop->proc_mem($pid);
      my $size     = $proc_mem->size;
      my $share    = $proc_mem->share;
      $total_real += $size - $share;
      $max_shared  = $share if $max_shared < $share;
  }
  $total_real += $max_shared;

So as you see, we accumulate the difference between the shared and reported memory:

  $total_real += $size - $share;

and at the end add the biggest shared process size:

  $total_real += $max_shared;

So now C<$total_real> contains approximately the memory really used.

=head3 Are My Variables Shared?

How do you find out if the code you write is shared between the processes or not? The code should be shared, except where it is on a memory page with variables that change. Some variables are read-only in usage and never change; for example, you may have some variables that use a lot of memory and that you want to stay read-only. As you know, a variable becomes unshared when the process modifies its value.
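A quick way to observe this effect is to sample the shared size just before and just after writing to a variable that was created at server startup. The following is only a sketch: C<$MyPreloaded::big_string> is a hypothetical variable assumed to have been populated in a preloaded module.

  use GTop ();
  my $gtop   = GTop->new;
  my $before = $gtop->proc_mem($$)->share;
  # writing to the (hypothetical) preloaded variable dirties its pages
  $MyPreloaded::big_string .= 'x';
  my $after  = $gtop->proc_mem($$)->share;
  print "shared memory dropped by ", $before - $after, " bytes\n";

Run inside a mod_perl child process, this should report a drop roughly proportional to the number of pages the write has touched.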
So imagine that you have a 10Mb in-memory database that resides in a single variable; you perform various operations on it and want to make sure that the variable is still shared. For example, if you do some regular expression (regex) matching on this variable and want to use the pos() function, will that make the variable unshared or not?

The C<Apache::Peek> module comes to the rescue. Let's write a module called I<MyShared.pm> which we preload at server startup, so all the variables of this module are initially shared by all children.

  MyShared.pm
  ---------
  package MyShared;
  use Apache::Peek;

  my $readonly = "Chris";

  sub match    { $readonly =~ /\w/g;               }
  sub print_pos{ print "pos: ",pos($readonly),"\n";}
  sub dump     { Dump($readonly);                  }
  1;

This module declares the package C<MyShared>, loads the C<Apache::Peek> module and defines the lexically scoped C<$readonly> variable, which is supposed to be a variable of large size (think of a huge hash data structure), but we will use a small one to simplify this example. The module also defines three subroutines: match(), which does a simple character match; print_pos(), which prints the current position of the matching engine inside the string that was last matched; and finally dump(), which calls the C<Apache::Peek> module's Dump() function to dump the raw Perl representation of the C<$readonly> variable.

Now we write a script that prints the process ID (PID) and calls all three functions. The goal is to check whether pos() makes the variable I<dirty> and therefore unshared.

  share_test.pl
  -------------
  use MyShared;
  print "Content-type: text/plain\r\n\r\n";
  print "PID: $$\n";
  MyShared::match();
  MyShared::print_pos();
  MyShared::dump();

Before you restart the server, in I<httpd.conf> set:

  MaxClients 2

for easier tracking. You need at least two servers to compare the printouts of the test program. Having more than two can make the comparison process harder.

Now open two browser windows and issue requests for this script several times in both windows, so that you get different process PIDs reported in the two windows and each process has processed a different number of requests to the I<share_test.pl> script.

In the first window you will see something like this:

  PID: 27040
  pos: 1
  SV = PVMG(0x853db20) at 0x8250e8c
    REFCNT = 3
    FLAGS = (PADBUSY,PADMY,SMG,POK,pPOK)
    IV = 0
    NV = 0
    PV = 0x8271af0 "Chris"\0
    CUR = 5
    LEN = 6
    MAGIC = 0x853dd80
      MG_VIRTUAL = &vtbl_mglob
      MG_TYPE = 'g'
      MG_LEN = 1

And in the second window:

  PID: 27041
  pos: 2
  SV = PVMG(0x853db20) at 0x8250e8c
    REFCNT = 3
    FLAGS = (PADBUSY,PADMY,SMG,POK,pPOK)
    IV = 0
    NV = 0
    PV = 0x8271af0 "Chris"\0
    CUR = 5
    LEN = 6
    MAGIC = 0x853dd80
      MG_VIRTUAL = &vtbl_mglob
      MG_TYPE = 'g'
      MG_LEN = 2

We see that all the addresses of the supposedly big structure are the same (C<0x8250e8c> and C<0x8271af0>), therefore the variable's data structure is almost completely shared. The only difference is in the C<SV.MAGIC.MG_LEN> record, which is not shared.

So given that the C<$readonly> variable is a big one, its value is still shared between the processes, while only part of the variable's data structure is unshared. But that part is almost insignificant, since it takes very little memory space.

Now if you need to compare more than one variable, doing it by hand can be quite time consuming and error prone. Therefore it's better to change the testing script to dump the Perl data-types into files (e.g. I</tmp/dump.$$>, where C<$$> is the PID of the process) and then use the diff(1) utility to see whether there is any difference.
So correcting the dump() function to write the info to a file will do the job. Notice that we use C<Devel::Peek> and not C<Apache::Peek>. The two are almost the same, but C<Apache::Peek> prints its output directly to the opened socket, so we cannot intercept the result and redirect it to a file. Since C<Devel::Peek> dumps its results to the STDERR stream, we can use the old trick of saving away the default STDERR handle, and opening a new filehandle on STDERR. In our example, when C<Devel::Peek> now prints to STDERR it actually prints to our file. When we are done, we make sure to restore the original STDERR filehandle.

So this is the resulting code:

  MyShared2.pm
  ---------
  package MyShared2;
  use Devel::Peek;

  my $readonly = "Chris";

  sub match    { $readonly =~ /\w/g;               }
  sub print_pos{ print "pos: ",pos($readonly),"\n";}
  sub dump{
      my $dump_file = "/tmp/dump.$$";
      print "Dumping the data into $dump_file\n";
      open OLDERR, ">&STDERR";
      open STDERR, ">".$dump_file or die "Can't open $dump_file: $!";
      Dump($readonly);
      close STDERR ;
      open STDERR, ">&OLDERR";
  }
  1;

Now we modify the test script to use the modified module:

  share_test2.pl
  -------------
  use MyShared2;
  print "Content-type: text/plain\r\n\r\n";
  print "PID: $$\n";
  MyShared2::match();
  MyShared2::print_pos();
  MyShared2::dump();

We run it as before (with S<MaxClients 2>), and two dump files are created in the directory I</tmp>. In our test these were created as I</tmp/dump.1224> and I</tmp/dump.1225>. When we run diff(1):

  % diff /tmp/dump.1224 /tmp/dump.1225
  12c12
  < MG_LEN = 1
  ---
  > MG_LEN = 2

we see that the two dumps of the variable C<$readonly> differ, just as we observed before when we did the manual comparison.

In fact, if we think about these results again, we come to the conclusion that there is no need for two processes to find out whether the variable gets modified (and therefore unshared). It's enough to check the data structure before the script is executed and after. You can modify the C<MyShared2> module to dump the data structure into a different file after each invocation and then run diff(1) on the two files.

If you want to watch whether some lexically scoped (with my()) variables in your C<Apache::Registry> script get changed between invocations inside the same process, you can use the C<Apache::RegistryLexInfo> module instead, since it does exactly this: it makes a snapshot of the padlist before and after the code execution and shows the difference between the two. This specific module was written to work with C<Apache::Registry> scripts, so it won't work for loaded modules. Use the technique we have described above for any type of variables in modules and scripts.

Surely another way of ensuring that a scalar is read-only and therefore sharable is to use either the C<constant> pragma or the C<readonly> pragma. But then you won't be able to make calls that alter the variable even a little, like in the example that we just showed, because it will be a true constant variable and you will get a compile time error if you try this:

  MyConstant.pm
  -------------
  package MyConstant;
  use constant readonly => "Chris";

  sub match    { readonly =~ /\w/g;               }
  sub print_pos{ print "pos: ",pos(readonly),"\n";}
  1;

  % perl -c MyConstant.pm

  Can't modify constant item in match position at MyConstant.pm line
  5, near "readonly)"
  MyConstant.pm had compilation errors.
However this code is just right:

  MyConstant1.pm
  -------------
  package MyConstant1;
  use constant readonly => "Chris";

  sub match { readonly =~ /\w/g; }
  1;

=head3 Preloading Perl Modules at Server Startup

You can use the C<PerlRequire> and C<PerlModule> directives to load commonly used modules such as C<CGI.pm>, C<DBI> and so on, when the server is started. On most systems, server children will be able to share the code space used by these modules. Just add the following directives into I<httpd.conf>:

  PerlModule CGI
  PerlModule DBI

But an even better approach is to create a separate startup file (where you code in plain perl) and put there things like:

  use DBI ();
  use Carp ();

Don't forget to prevent importing of the symbols exported by default by the modules you are going to preload, by placing empty parentheses C<()> after each module's name - unless you need some of these in the startup file, which is unlikely. This will save you a few more memory bits.

Then you C<require()> this startup file in I<httpd.conf> with the C<PerlRequire> directive, placing it before the rest of the mod_perl configuration directives:

  PerlRequire /path/to/start-up.pl

C<CGI.pm> is a special case. Ordinarily C<CGI.pm> autoloads most of its functions on an as-needed basis. This speeds up the loading time by deferring the compilation phase. When you use mod_perl, FastCGI or another system that uses a persistent Perl interpreter, you will want to precompile the functions at initialization time. To accomplish this, call the package function compile() like this:

  use CGI ();
  CGI->compile(':all');

The arguments to C<compile()> are a list of method names or sets, and are identical to those accepted by the C<use()> and C<import()> operators. Note that in most cases you will want to replace C<':all'> with the tag names that you actually use in your code, since generally you only use a subset of them.

Let's conduct a memory usage test to prove that preloading reduces memory requirements. In order to have an easy measurement we will use only one child process, therefore we will use this setting:

  MinSpareServers 1
  MaxSpareServers 1
  StartServers 1
  MaxClients 1
  MaxRequestsPerChild 100

We are going to use the C<Apache::Registry> script I<memuse.pl>, which consists of two parts: the first preloads a bunch of modules (most of which aren't going to be used); the second reports the memory size and the shared memory size used by the single child process that we start, and of course it prints the difference between the two sizes.

  memuse.pl
  ---------
  use strict;
  use CGI ();
  use DB_File ();
  use LWP::UserAgent ();
  use Storable ();
  use DBI ();
  use GTop ();

  my $r = shift;
  $r->send_http_header('text/plain');
  my $proc_mem = GTop->new->proc_mem($$);
  my $size  = $proc_mem->size;
  my $share = $proc_mem->share;
  my $diff  = $size - $share;
  printf "%10s %10s %10s\n", qw(Size Shared Difference);
  printf "%10d %10d %10d (bytes)\n",$size,$share,$diff;

First we restart the server and execute this CGI script with none of the above modules preloaded. Here is the result:

     Size   Shared     Diff
  4706304  2134016  2572288 (bytes)

Now we take all the modules:

  use strict;
  use CGI ();
  use DB_File ();
  use LWP::UserAgent ();
  use Storable ();
  use DBI ();
  use GTop ();

and copy them into the startup script, so they will get preloaded. The script remains unchanged. We restart the server and execute it again. We get the following:
     Size   Shared    Diff
  4710400  3997696  712704 (bytes)

Let's put the two results into one table:

  Preloading     Size   Shared      Diff
     Yes      4710400  3997696    712704 (bytes)
      No      4706304  2134016   2572288 (bytes)
  ----------------------------------------------
  Difference     4096  1863680  -1859584

You can clearly see that when the modules weren't preloaded, the size of the shared memory pages was about 1864Kb smaller than in the case where the modules were preloaded.

Assuming that you have 256M dedicated to the web server, if you didn't preload the modules, you could have:

  268435456 = X * 2572288 + 2134016

  X = (268435456 - 2134016) / 2572288 = 103

103 servers.

Now let's calculate the same thing with the modules preloaded:

  268435456 = X * 712704 + 3997696

  X = (268435456 - 3997696) / 712704 = 371

You can have almost 4 times as many servers!!!

Remember that we have mentioned before that memory pages get dirty and the size of the shared memory gets smaller with time? So we have presented the ideal case where the shared memory stays intact. Therefore the real numbers will be a little bit different, but not far from the numbers in our example. Also it's obvious that in your case it's possible that the process size will be bigger and the shared memory will be smaller, since you will use different modules and different code, so you won't get this fantastic ratio, but this example certainly helps to feel the difference.

=head3 Preloading Registry Scripts at Server Startup

What happens if you find yourself stuck with Perl CGI scripts and you cannot or don't want to move most of the stuff into modules to benefit from module preloading, so that the code will be shared by the children? Luckily you can preload scripts as well. This time the C<Apache::RegistryLoader> module comes to our aid. C<Apache::RegistryLoader> compiles C<Apache::Registry> scripts at server startup.

For example, to preload the script I</perl/test.pl>, which is in fact the file I</home/httpd/perl/test.pl>, you would do the following:

  use Apache::RegistryLoader ();
  Apache::RegistryLoader->new->handler("/perl/test.pl",
                          "/home/httpd/perl/test.pl");

You should put this code either into C<E<lt>PerlE<gt>> sections or into a startup script.

But what if you have a bunch of scripts located under the same directory and you don't want to list them one by one? Put the Perl modules to good use: the C<File::Find> module will do most of the work for you.

The following code walks the directory tree under which all the C<Apache::Registry> scripts are located. For each encountered file with the extension I<.pl>, it calls the C<Apache::RegistryLoader::handler()> method to preload the script in the parent server, before pre-forking the child processes:

  use File::Find qw(finddepth);
  use Apache::RegistryLoader ();
  {
      my $scripts_root_dir = "/home/httpd/perl/";
      my $rl = Apache::RegistryLoader->new;
      finddepth
          (
           sub {
               return unless /\.pl$/;
               my $url = "$File::Find::dir/$_";
               $url =~ s|$scripts_root_dir/?|/|;
               warn "pre-loading $url\n";
               # preload $url
               my $status = $rl->handler($url);
               unless($status == 200) {
                   warn "pre-load of `$url' failed, status=$status\n";
               }
           },
           $scripts_root_dir);
  }

Note that we didn't use the second argument to C<handler()> here, as we did in the first example. To make the loader smarter about the URI to filename translation, you might need to provide a C<trans()> function to translate the URI to a filename. URI to filename translation normally doesn't happen until HTTP request time, so the module is forced to roll its own translation.
If the filename is omitted and a C<trans()> function was not defined, the loader will try to use the URI relative to B<ServerRoot>.

A simple trans() function can be something like this:

  sub mytrans {
      my $uri = shift;
      $uri =~ s|^/perl/|/home/httpd/perl/|;
      return $uri;
  }

You can easily derive the right translation by looking at the C<Alias> directive. The above mytrans() function matches our C<Alias>:

  Alias /perl/ /home/httpd/perl/

After defining the URI to filename translation function you should pass it during the creation of the C<Apache::RegistryLoader> object:

  my $rl = Apache::RegistryLoader->new(trans => \&mytrans);

I won't show any benchmarks here, since the effect is absolutely the same as with preloading modules.

See also L<BEGIN blocks|porting/BEGIN_blocks>

=head3 Modules Initializing at Server Startup

We have just learned that it's important to preload modules and scripts at server startup. It turns out that this is not enough for some modules: you also have to prerun their initialization code to get more memory pages shared. You will generally find information about specific modules in their respective manpages. We will present a few examples of widely used modules where the code can be initialized.

=head4 Initializing DBI.pm

The first example is the C<DBI> module. As you know, C<DBI> works with many database drivers from the C<DBD::> category, e.g. C<DBD::mysql>. If you want to minimize memory use after forking the child processes, it's not enough to preload C<DBI>: you should also initialize C<DBI> with the driver(s) that you are going to use (usually a single driver is used). Note that you want to do this under mod_perl and other environments where shared memory is very important; otherwise you shouldn't initialize drivers.

You probably know already that under mod_perl you should use the C<Apache::DBI> module to get connection persistence, unless you open a separate connection for each user - in that case you should not use this module. C<Apache::DBI> automatically loads C<DBI> and overrides some of its methods, so you should keep coding as if only the C<DBI> module were in use.

Just as with module preloading, our goal is to find the startup environment that will lead to the smallest I<"difference"> between the shared and normal memory reported, and therefore to a smaller total memory usage. And again, in order to have an easy measurement we will use only one child process, therefore we will use this setting in I<httpd.conf>:

  MinSpareServers 1
  MaxSpareServers 1
  StartServers 1
  MaxClients 1
  MaxRequestsPerChild 100

We always preload these modules:

  use GTop ();
  use Apache::DBI (); # preloads DBI as well

We are going to run memory benchmarks on five different versions of the I<startup.pl> file:

=over

=item option 1

Leave the file unmodified.

=item option 2

Install the MySQL driver (we will use the MySQL RDBMS for our test):

  DBI->install_driver("mysql");

It's safe to use this method, since just as with C<use()>, if the driver can't be installed it'll die().

=item option 3

Preload the MySQL driver module:

  use DBD::mysql;

=item option 4

Tell C<Apache::DBI> to connect to the database when the child process starts (C<ChildInitHandler>); no driver is preloaded before the child gets spawned!

  Apache::DBI->connect_on_init('DBI:mysql:test::localhost',
                          "",
                          "",
                          {
                           PrintError => 1, # warn() on errors
                           RaiseError => 0, # don't die on error
                           AutoCommit => 1, # commit executes
                                            # immediately
                          }
                         )
    or die "Cannot connect to database: $DBI::errstr";

=item option 5

Options 2 and 4 together: use both install_driver() and connect_on_init().
=back

Here is the C<Apache::Registry> test script that we have used:

  preload_dbi.pl
  --------------
  use strict;
  use GTop ();
  use DBI ();

  my $dbh = DBI->connect("DBI:mysql:test::localhost",
                         "",
                         "",
                         {
                          PrintError => 1, # warn() on errors
                          RaiseError => 0, # don't die on error
                          AutoCommit => 1, # commit executes
                                           # immediately
                         }
                        )
    or die "Cannot connect to database: $DBI::errstr";

  my $r = shift;
  $r->send_http_header('text/plain');

  my $do_sql = "show tables";
  my $sth = $dbh->prepare($do_sql);
  $sth->execute();
  my @data = ();
  while (my @row = $sth->fetchrow_array){
      push @data, @row;
  }
  print "Data: @data\n";
  $dbh->disconnect(); # NOP under Apache::DBI

  my $proc_mem = GTop->new->proc_mem($$);
  my $size  = $proc_mem->size;
  my $share = $proc_mem->share;
  my $diff  = $size - $share;
  printf "%8s %8s %8s\n", qw(Size Shared Diff);
  printf "%8d %8d %8d (bytes)\n",$size,$share,$diff;

The script opens a connection to the database I<'test'> and issues a query to learn what tables the database has. When the data is collected and printed, the connection would normally be closed, but C<Apache::DBI> overrides disconnect() with an empty method. After the data is processed, the already familiar code to print the memory usage follows.

The server was restarted before each new test.

So here are the results of the five tests that were conducted, sorted by the I<Diff> column:

=over

=item 1

After the first request:

  Test type                                Size   Shared     Diff
  -----------------------------------------------------------------
  install_driver (2)                    3465216  2621440   843776
  install_driver & connect_on_init (5)  3461120  2609152   851968
  preload driver (3)                    3465216  2605056   860160
  nothing added (1)                     3461120  2494464   966656
  connect_on_init (4)                   3461120  2482176   978944

=item 2

After the second request (all the subsequent requests showed the same results):

  Test type                                Size   Shared     Diff
  -----------------------------------------------------------------
  install_driver (2)                    3469312  2609152   860160
  install_driver & connect_on_init (5)  3481600  2605056   876544
  preload driver (3)                    3469312  2588672   880640
  nothing added (1)                     3477504  2482176   995328
  connect_on_init (4)                   3481600  2469888  1011712

=back

Now what do we conclude from looking at these numbers? First we see that only after the second reload do we get the final memory footprint for the specific request in question (if you pass different arguments the memory usage might and will be different). But both tables show the same pattern of memory usage.

We can clearly see that the real winner is the version of I<startup.pl> where the MySQL driver was installed (2). Since we want to have a connection ready for the first request made to the freshly spawned child process, we generally use version (5), which uses somewhat more memory but has almost the same amount of shared memory. Version (3) only preloads the driver, which results in less shared memory. The two remaining versions are the one with nothing initialized (1) and the one with only the connect_on_init() method used (4). The former is a little bit better than the latter, but both are significantly worse than the first two versions.

To remind you why we look for the smallest value in the I<Diff> column, recall the real memory usage formula:

  RAM_dedicated_to_mod_perl = diff * number_of_processes
                            + the_process_with_largest_shared_memory

Notice that the smaller the diff is, the bigger the number of processes you can have using the same amount of RAM. Therefore every 100K difference counts, when you multiply it by the number of processes. If we take the numbers from version (2) vs.
(4) and assume that we have 256M of memory dedicated to mod_perl processes, we get the following numbers using a formula derived from the one above:

               RAM - largest_shared_size
  N_of Procs = -------------------------
                        Diff

              268435456 - 2609152
  (ver 2) N = ------------------- = 309
                     860160

              268435456 - 2469888
  (ver 4) N = ------------------- = 262
                    1011712

So you can tell the difference (about 18% more child processes in the first version).

=head4 Initializing CGI.pm

C<CGI.pm> is a big module that by default postpones the compilation of its methods until they are actually needed, thus making it possible to use it under a slow mod_cgi handler without adding a big overhead. That's not what we want under mod_perl - if you use C<CGI.pm>, in addition to preloading the module you should precompile the methods that you are going to use at server startup. Use the compile method for that:

  use CGI;
  CGI->compile(':all');

where you should replace the tag group C<:all> with the real tags and group tags that you are going to use, if you want to optimize memory usage.

We are going to compare the shared memory footprint using a script which is backward compatible with mod_cgi. You will see that you can improve the performance of this kind of script as well, but if you really want fast code, think about porting it to use C<Apache::Request> for the CGI interface and some other module for HTML generation.

So here is the C<Apache::Registry> script that we are going to use to make the comparison:

  preload_cgi_pm.pl
  -----------------
  use strict;
  use CGI ();
  use GTop ();

  my $q = new CGI;
  print $q->header('text/plain');
  print join "\n", map {"$_ => ".$q->param($_) } $q->param;
  print "\n";

  my $proc_mem = GTop->new->proc_mem($$);
  my $size  = $proc_mem->size;
  my $share = $proc_mem->share;
  my $diff  = $size - $share;
  printf "%8s %8s %8s\n", qw(Size Shared Diff);
  printf "%8d %8d %8d (bytes)\n",$size,$share,$diff;

The script initializes the C<CGI> object, sends the HTTP header and then prints any arguments and values that were passed to it. At the end, as usual, we print the memory usage.

As usual, we are going to use a single child process, therefore we will use this setting in I<httpd.conf>:

  MinSpareServers 1
  MaxSpareServers 1
  StartServers 1
  MaxClients 1
  MaxRequestsPerChild 100

We are going to run memory benchmarks on three different versions of the I<startup.pl> file. We always preload this module:

  use GTop ();

=over

=item option 1

Leave the file unmodified.

=item option 2

Preload C<CGI.pm>:

  use CGI ();

=item option 3

Preload C<CGI.pm> and precompile the methods that we are going to use in the script:

  use CGI ();
  CGI->compile(qw(header param));

=back

The server was restarted before each new test.

So here are the results of the three tests that were conducted, sorted by the I<Diff> column:

=over

=item 1

After the first request:

  Version     Size   Shared     Diff  Test type
  -------------------------------------------------------------------
     1     3321856  2146304  1175552  not preloaded
     2     3321856  2326528   995328  preloaded
     3     3244032  2465792   778240  preloaded & methods precompiled

=item 2

After the second request (all the subsequent requests showed the same results):

  Version     Size   Shared     Diff  Test type
  -------------------------------------------------------------------
     1     3325952  2134016  1191936  not preloaded
     2     3325952  2314240  1011712  preloaded
     3     3248128  2445312   802816  preloaded & methods precompiled

=back

The first version shows the results of the script execution when C<CGI.pm> wasn't preloaded.
The second version shows them with the module preloaded. The third is when it's both preloaded and the methods that are going to be used are precompiled at server startup.

By comparing versions one and two of the second table we can see that preloading adds about 176K of shared size (2314240-2134016 bytes), reducing the I<Diff> column by the same amount. As we mentioned at the beginning of this section, that's how C<CGI.pm> was implemented - compilation is deferred - so merely preloading the module doesn't buy you that much. But if we compare the second and the third versions we see a very significant further difference of 204K (1011712-802816 bytes), and we have used only a few methods (the I<header> method loads a few more methods transparently for the user). Imagine how much memory we are going to save if we precompile all the methods that we use in the other scripts that use C<CGI.pm> and do a little bit more than the script we have used in the test.

But even in our very simple case, using the same formula, what do we see? (assuming that we have 256MB dedicated for mod_perl)

               RAM - largest_shared_size
  N_of Procs = -------------------------
                        Diff

              268435456 - 2134016
  (ver 1) N = ------------------- = 223
                    1191936

              268435456 - 2445312
  (ver 3) N = ------------------- = 331
                     802816

If we preload C<CGI.pm> and precompile a few methods that we use in the test script, we can have almost 50% more child processes than when we don't preload and precompile the methods that we are going to use.

META: I've heard that the 3.x generation will be less bloated, so probably I'll have to rerun this using the new version.

=head2 Increasing Shared Memory With mergemem

C<mergemem> is an experimental utility for Linux, which looks I<very> interesting for us mod_perl users:

  http://www.complang.tuwien.ac.at/ulrich/mergemem/

It looks like it could be run periodically on your server to find and merge duplicate pages. It won't halt your httpds during the merge; this aspect was taken into consideration during the design of mergemem: merging is not performed with one big system call. Instead, most of the operation is done in userspace, making a lot of small system calls.

Therefore blocking of the system should not happen, and if it really should turn out to take too much time you can reduce the priority of the process.

The worst case that can happen is this: C<mergemem> merges two pages and immediately afterwards they are split again. The split costs about the same as the time consumed by merging.

This software comes with a utility called C<memcmp> to tell you how much you might save.

=head2 Forking and Executing Subprocesses from mod_perl

It's desirable to avoid forking under mod_perl, since when you do, you are forking the entire Apache server - lock, stock and barrel. Not only is your Perl code and Perl interpreter being duplicated, but so is mod_ssl, mod_rewrite, mod_log, mod_proxy, mod_speling (it's not a typo!) or whatever modules you have used in your server, all the core routines, etc.

Modern Operating Systems come with a very light version of fork which adds little overhead when called, since it is optimized to do the absolute minimum of memory page duplication. The I<copy-on-write> technique is what makes this possible. The gist of this technique is as follows: the parent process' memory pages aren't immediately copied to the child's space on fork(); this is done only when the child or the parent modifies the data in some memory pages.
Before the pages get modified, they are marked as dirty, and the child has no choice but to copy the pages that are to be modified, since they cannot be shared any more.

If you need to call a Perl program from your mod_perl code, it's better to try to convert the program into a module and call it as a function, without spawning a special process to do that. Of course if you cannot do that, or the program is not written in Perl, you have to call it via system() or an equivalent, which spawns a new process. If the program is written in C, you may try to write Perl glue code with the help of XS or SWIG, and then the program will be executed as a perl subroutine.

Also, by trying to spawn a sub-process, you might be trying to do the I<"wrong thing">. If what you really want is to send information to the browser and then do some post-processing, look into the C<PerlCleanupHandler> directive, which allows you to run code after the request has been processed and the client has received the response. This doesn't free the mod_perl process to serve other requests any sooner, but it does allow you to send the response to the client faster. If this is the situation and you need to run some cleanup code, you may want to register this code during the request processing via:

  my $r = shift;
  $r->register_cleanup(\&do_cleanup);
  sub do_cleanup{ #some clean-up code here }

But when a long-running process needs to be spawned, there is little choice but to use fork(). We cannot just run this long-running process within the Apache process, since first it would keep the Apache process busy, instead of letting it do the job it was designed for, and second, if Apache is stopped the long-running process might be terminated as well, unless it is coded properly to detach from the Apache process group.

In the following sections we are going to discuss how to properly spawn new processes under mod_perl.

=head3 Forking a New Process

This is a typical way to call fork() under mod_perl:

  defined (my $kid = fork) or die "Cannot fork: $!\n";
  if ($kid) {
      # Parent runs this block
  } else {
      # Child runs this block
      # some code comes here
      CORE::exit(0);
  }
  # possibly more code here usually run by the parent

When using fork(), you should check its return value, since if it returns C<undef> it means that the call was unsuccessful and no process was spawned. This can happen when the system is running too many processes and cannot spawn new ones.

When the process is successfully forked, the parent receives the PID of the newly spawned child as the return value of the fork() call, and the child receives 0. Now the program splits into two. In the above example, the code inside the first block after I<if> will be executed by the parent, and the code inside the first block after I<else> will be executed by the child process.

It's important not to forget to explicitly call exit() at the end of the child code when forking, since if you don't, and there is some code outside the I<if/else block>, the child process will execute it as well. But under mod_perl there is another nuance: you must use C<CORE::exit()> and not exit(), which would be automatically overridden by C<Apache::exit()> if used in conjunction with C<Apache::Registry> and similar modules. We want the spawned process to quit when its work is done, otherwise it'll just stay alive, use resources and do nothing.

The parent process usually completes its execution path and enters the pool of free servers to wait for a new assignment.
If the execution path is to be aborted earlier for some reason, one should use Apache::exit() or die(); in the case of C<Apache::Registry> or C<Apache::PerlRun> handlers a simple exit() will do the right thing.

The child shares its memory pages with the parent until it has to modify some of them, which triggers a I<copy-on-write> process that copies these pages to the child's domain before the child is allowed to modify them. But this all happens afterwards. At the moment the fork() call is executed, the only work to be done before the child process goes on its separate way is setting up the page tables for the virtual memory, which imposes almost no delay at all.

=head3 Freeing the Parent Process

In the child code you must also close all the pipes to the connection socket that were opened by the parent process (i.e. C<STDIN> and C<STDOUT>) and inherited by the child, so that the parent will be able to complete the request and free itself for serving other requests. If you need the C<STDIN> and/or C<STDOUT> streams you should re-open them. You may need to close or re-open the C<STDERR> filehandle as well. It's opened to append to the I<error_log> file, as inherited from the parent, so chances are that you will want to leave it untouched.

Under mod_perl, the spawned process also inherits the file descriptor that's tied to the socket through which all the communication between the server and the client happens. Therefore we need to free this stream in the forked process. If we don't do that, the server cannot be restarted while the spawned process is still running. If an attempt is made to restart the server you will get the following error:

  [Mon Dec 11 19:04:13 2000] [crit]
  (98)Address already in use: make_sock:
    could not bind to address 127.0.0.1 port 8000

C<Apache::SubProcess> comes to help here, providing a method cleanup_for_exec() which takes care of closing this file descriptor.

So the simplest way to free the parent process is to close all three STD* streams, if we don't need them, and untie the Apache socket. In addition you may want to change the process' current directory to I</>, so the forked process won't keep the mounted partition busy if it is to be unmounted at a later time.

To summarize all these issues, here is an example of a fork that takes care of freeing the parent process:

  use Apache::SubProcess;
  my $r = shift;  # the Apache request object
  defined (my $kid = fork) or die "Cannot fork: $!\n";
  if ($kid) {
      # Parent runs this block
  } else {
      # Child runs this block
      $r->cleanup_for_exec(); # untie the socket
      chdir '/' or die "Can't chdir to /: $!";
      close STDIN;
      close STDOUT;
      close STDERR;

      # some code comes here

      CORE::exit(0);
  }
  # possibly more code here usually run by the parent

Of course the real code goes between freeing the parent and the child's termination.

=head3 Detaching the Forked Process

Now what happens if the forked process is running and we decide that we need to restart the web-server? The forked process will be aborted, since when the parent process dies during the restart it kills its child processes as well. In order to avoid this, we need to detach the process from its parent session, by opening a new session with the help of the setsid() system call, provided by the C<POSIX> module:

  use POSIX 'setsid';

  defined (my $kid = fork) or die "Cannot fork: $!\n";
  if ($kid) {
      # Parent runs this block
  } else {
      # Child runs this block
      setsid or die "Can't start a new session: $!";
      ...
  }

Now the spawned child process has a life of its own, and it doesn't depend on the parent anymore.
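If you want to convince yourself that the detached child really survives a server restart, you can probe it by its PID. This is a minimal sketch; the PID used below is hypothetical, use whatever PID your parent block reported:

  my $pid = 12345; # hypothetical PID reported by the parent block
  # signal 0 delivers no signal; it only checks that the process
  # exists and that we are allowed to signal it
  print kill(0, $pid) ? "still alive\n" : "gone: $!\n";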
=head3 Avoiding Zombie Processes

Now let's talk about zombie processes.

Normally, every process has a parent. Many processes are children of the C<init> process, whose C<PID> is C<1>. When you fork a process, you must wait() or waitpid() for it to finish. If you don't wait() for it, it becomes a zombie.

A zombie is a process that has terminated, but whose exit status has not been collected by its parent. When the child quits, it reports the termination to its parent. If no parent wait()s to collect the child's exit status, the child becomes a I<defunct> process: it can still be seen in the process table, but it cannot be killed. It will go away only when you stop the parent process that spawned it!

Generally the ps(1) utility displays these processes with the C<E<lt>defunctE<gt>> tag, and you will see the zombie counter increment when running top(1). Zombie processes each take up a slot in the process table and are generally undesirable.

So the proper way to do a fork is:

  my $r = shift;
  $r->send_http_header('text/plain');

  defined (my $kid = fork) or die "Cannot fork: $!";
  if ($kid) {
      waitpid($kid,0);
      print "Parent has finished\n";
  } else {
      # do something
      CORE::exit(0);
  }

In most cases the only reason you would want to fork is when you need to spawn a process that will take a long time to complete. So if the Apache process that spawns this new child process has to wait for it to finish, you have gained nothing. You can neither wait for its completion (because you don't have the time to), nor continue without waiting, because then you will get yet another zombie process. This is called a blocking call, since the process is blocked from doing anything else until this call completes.

The simplest solution is to ignore your dead children. Just add this line before the fork() call:

  $SIG{CHLD} = 'IGNORE';

When you set the C<CHLD> (C<SIGCHLD> in C) signal handler to C<'IGNORE'>, all the children will be collected by the C<init> process and are therefore prevented from becoming zombies. This doesn't work everywhere, however. It is known to work at least on Linux.

Note that you cannot localize this setting with C<local()>. If you do, it won't have the desired effect.

[META: Can anyone explain why localization doesn't work?]

So now the code would look like this:

  my $r = shift;
  $r->send_http_header('text/plain');

  $SIG{CHLD} = 'IGNORE';

  defined (my $kid = fork) or die "Cannot fork: $!\n";
  if ($kid) {
      print "Parent has finished\n";
  } else {
      # do something time-consuming
      CORE::exit(0);
  }

Note that the waitpid() call is gone. The S<$SIG{CHLD} = 'IGNORE';> statement protects us from zombies, as explained above.

Another, more portable, but slightly more expensive solution is to use the double fork approach:

  my $r = shift;
  $r->send_http_header('text/plain');

  defined (my $kid = fork) or die "Cannot fork: $!\n";
  if ($kid) {
      waitpid($kid,0);
  } else {
      defined (my $grandkid = fork) or die "Kid cannot fork: $!\n";
      if ($grandkid) {
          CORE::exit(0);
      } else {
          # code here
          # do something long lasting
          CORE::exit(0);
      }
  }

The grandkid becomes a I<"child of init">, i.e. a child of the process whose PID is 1.

Note that the previous two solutions do not allow you to know the exit status of the long-running process, but in our example we don't care about it.

Another solution is to use a different C<SIGCHLD> handler:

  use POSIX 'WNOHANG';
  $SIG{CHLD} = sub { while( waitpid(-1,WNOHANG)>0 ) {} };

which is useful when you fork() more than one process.
The handler could call wait() as well, but for a variety of reasons involving the handling of stopped processes and the rare event in which two children exit at nearly the same moment, the best technique is to call waitpid() in a tight loop with a first argument of C<-1> and a second argument of C<WNOHANG>. Together these arguments tell waitpid() to reap the next child that's available, and prevent the call from blocking if there happens to be no child ready for reaping. The handler will loop until waitpid() returns a negative number or zero, indicating that no more reapable children remain.

While you test and debug your code that uses one of the above examples, you might want to write some debug information to the error_log file so you know what happens.

Read the I<perlipc> manpage for more information about signal handlers.

=head3 A Complete Fork Example

Now let's put all the bits of code together and show a well written piece of fork code that solves all the problems discussed so far. We will use an C<Apache::Registry> script for this purpose:

  proper_fork1.pl
  ---------------
  use strict;
  use POSIX 'setsid';
  use Apache::SubProcess;

  my $r = shift;
  $r->send_http_header("text/plain");

  $SIG{CHLD} = 'IGNORE';

  defined (my $kid = fork) or die "Cannot fork: $!\n";
  if ($kid) {
      print "Parent $$ has finished, kid's PID: $kid\n";
  } else {
      $r->cleanup_for_exec(); # untie the socket
      chdir '/'                 or die "Can't chdir to /: $!";
      open STDIN, '/dev/null'   or die "Can't read /dev/null: $!";
      open STDOUT, '>/dev/null' or die "Can't write to /dev/null: $!";
      open STDERR, '>/tmp/log'  or die "Can't write to /tmp/log: $!";
      setsid or die "Can't start a new session: $!";

      my $oldfh = select STDERR;
      local $| = 1;
      select $oldfh;
      warn "started\n";
      # do something time-consuming
      sleep 1, warn "$_\n" for 1..20;
      warn "completed\n";

      CORE::exit(0); # terminate the process
  }

The script starts with the usual declaration of strict mode, loading the C<POSIX> and C<Apache::SubProcess> modules and importing the setsid() symbol from the C<POSIX> package.

The HTTP header is sent next, with the I<Content-type> of I<text/plain>.

The parent process gets ready to ignore the child, to avoid zombies, and the fork is called. The program splits into two after the fork: the if conditional evaluates to a true value for the parent process and to a false value for the child process, therefore the first block is executed by the parent and the second by the child.

The parent process announces its PID and the PID of the spawned process, and finishes its block. Any code that follows would be executed by the parent as well.

The child process starts its code by disconnecting from the socket, changing its current directory to C</>, and opening the STDIN and STDOUT streams to I</dev/null>, which in effect closes them both before opening. In fact in this example we don't need either of these, so we could just close() both.

The child process completes its disengagement from the parent process by opening the STDERR stream to I</tmp/log>, so it can write there, and by creating a new session with the help of setsid(). Now the child process has nothing to do with the parent process and can do the actual processing that it has to do.
In our example it performs a simple series of warnings, which are logged to I</tmp/log>:

  my $oldfh = select STDERR;
  local $| = 1;
  select $oldfh;
  warn "started\n";
  # do something time-consuming
  sleep 1, warn "$_\n" for 1..20;
  warn "completed\n";

The localized setting of C<$|=1> unbuffers the STDERR stream, so we can immediately see the debug output generated by the program. In fact this setting is not required when the output is generated by warn().

Finally the child process terminates by calling:

  CORE::exit(0);

which makes sure that it won't escape the block and run some code that it's not supposed to run.

This code example will allow you to verify that the spawned child process indeed has a life of its own, and that its parent is free as well. Simply issue a request that runs this script, watch the warnings start to show up in the I</tmp/log> file, then issue a complete server stop and start. If everything is correct, the server will successfully restart and the long-running process will still be running. You will know that it's still running if the warnings are still being written into the I</tmp/log> file. You may need to raise the number of warnings above 20 to make sure that you don't miss the end of the run.

If there were only 5 warnings to be printed, you would see the following output in this file:

  started
  1
  2
  3
  4
  5
  completed

=head3 Starting a Long Running External Program

But what happens if we cannot just run Perl code from the spawned process, because we have a compiled utility, i.e. a program written in C, or a Perl program which cannot easily be converted into a module and called as a function? In this case we have to use system(), exec(), qx() or C<``> (backticks) to start it.

When using any of these methods, and when I<Taint> mode is enabled, we must at least add the following code to untaint the I<PATH> environment variable and delete a few other insecure environment variables. This information can be found in the I<perlsec> manpage.

  $ENV{'PATH'} = '/bin:/usr/bin';
  delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};

Now all we have to do is to reuse the code from the previous section.
First we move the core program into the I<external.pl> file, add the shebang first line so that the program will be executed by Perl, tell the program to run under I<Taint> mode (-T), possibly enable I<warnings> mode (-w), and make it executable:

  external.pl
  -----------
  #!/usr/bin/perl -Tw

  open STDIN, '/dev/null'   or die "Can't read /dev/null: $!";
  open STDOUT, '>/dev/null' or die "Can't write to /dev/null: $!";
  open STDERR, '>/tmp/log'  or die "Can't write to /tmp/log: $!";

  my $oldfh = select STDERR;
  local $| = 1;
  select $oldfh;
  warn "started\n";
  # do something time-consuming
  sleep 1, warn "$_\n" for 1..20;
  warn "completed\n";

Now we replace the code that we moved into the external program with an exec() call:

  proper_fork_exec.pl
  -------------------
  use strict;
  use POSIX 'setsid';
  use Apache::SubProcess;

  $ENV{'PATH'} = '/bin:/usr/bin';
  delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};

  my $r = shift;
  $r->send_http_header("text/html");

  $SIG{CHLD} = 'IGNORE';

  defined (my $kid = fork) or die "Cannot fork: $!\n";
  if ($kid) {
      print "Parent has finished, kid's PID: $kid\n";
  } else {
      $r->cleanup_for_exec(); # untie the socket
      chdir '/'                 or die "Can't chdir to /: $!";
      open STDIN, '/dev/null'   or die "Can't read /dev/null: $!";
      open STDOUT, '>/dev/null' or die "Can't write to /dev/null: $!";
      open STDERR, '>&STDOUT'   or die "Can't dup stdout: $!";
      setsid or die "Can't start a new session: $!";

      exec "/home/httpd/perl/external.pl"
          or die "Cannot execute exec: $!";
  }

Notice that exec() never returns unless it fails to start the process. Therefore you shouldn't put any code after exec() - it will not be executed if exec() succeeds. Use system() or backticks instead if you want to continue doing other things in the process. But then you will probably want to terminate the process after the program has finished, so you will have to write:

  system("/home/httpd/perl/external.pl") == 0
      or die "Cannot execute system: $!";
  CORE::exit(0);

(Remember that system() returns 0 on success, so the popular C<system(...) or die ...> idiom would do the opposite of what you want here.)

Another important nuance is that we have to close all the STD* streams in the forked process, even if the called program does that itself.

If the external program is written in Perl, you may pass complicated data structures to it using one of the methods to serialize and then restore Perl data. The C<Storable> and C<FreezeThaw> modules come in handy. Let's say that we have a program I<master.pl> calling a program I<slave.pl>:

  master.pl
  ---------
  # we are within the mod_perl code
  use Storable ();
  my @params = (foo => 1, bar => 2);
  my $params = Storable::freeze(\@params);
  exec "./slave.pl", $params or die "Cannot execute exec: $!";

  slave.pl
  --------
  #!/usr/bin/perl -w
  use Storable ();
  my @params = @ARGV ? @{ Storable::thaw(shift)||[] } : ();
  # do something

As you can see, I<master.pl> serializes the C<@params> data structure with C<Storable::freeze> and passes it to I<slave.pl> as a single argument. I<slave.pl> restores it with C<Storable::thaw>, by shifting the first value off the C<@ARGV> array if available. The C<FreezeThaw> module does a very similar thing.

=head3 Starting a Short Running External Program

Sometimes you need to call an external program and you cannot continue before this program completes its run and optionally returns some result. In this case the fork solution doesn't help, but we have a few ways to execute such a program. The first is system():

  system "perl -e 'print 5+5'"

We believe that you will never call the perl interpreter just to do this simple calculation, but for the sake of a simple example it's good enough.
The problem with this approach is that we cannot get at the result printed to C<STDOUT>; that's where backticks or qx() come to the rescue. If you use either:

  my $result = `perl -e 'print 5+5'`;

or:

  my $result = qx{perl -e 'print 5+5'};

the whole output of the external program will be stored in the C<$result> variable.

Of course you can use other solutions, like opening a pipe to the program if you need to submit many arguments, and more evolved solutions provided by other Perl modules like C<IPC::Open2>, which allows you to open a process for both reading and writing.

=head3 Executing system() or exec() in the Right Way

The exec() and system() functions behave identically in the way they spawn a program. Let's use system() as an example. Consider the following code:

  system("echo","Hi");

Perl will use the first argument as the program to execute, find C</bin/echo> along the search path, invoke it directly, and pass the I<Hi> string as an argument.

Perl's system() is B<not> the C<system(3)> call [C-library]. This is how the arguments to system() get interpreted: when there is a single argument to system(), it is checked for shell metacharacters first (like C<*>, C<?>), and if there are any, the Perl interpreter invokes a real shell program (S</bin/sh -c> on Unix platforms) to parse it. If there are no shell metacharacters, the argument is split into words and passed directly to the C-level C<execvp()> system call. A list of arguments is likewise passed directly to C<execvp()>, without any metacharacter check. Bypassing the shell this way is more efficient - a I<very> nice optimization.

In other words, only if you do:

  system "sh -c 'echo *'"

will the operating system actually exec() a copy of C</bin/sh> to parse your command. But even then, since I<sh> is almost certainly already running somewhere, the system will notice that (via the disk inode reference) and replace your virtual memory page table with one pointing to the existing program code plus your data space, and thus will not incur that overhead.

=head2 OS Specific Parameters for Proxying

Most of the mod_perl enabled servers use a proxy front-end server. This is done in order to avoid serving static objects, and also so that generated output which might be received by slow clients does not keep the heavy but very fast mod_perl servers waiting idly.

There are very important OS parameters that you might want to change in order to improve the server performance. This topic is discussed in the section: L<Setting the Buffering Limits on Various OSes|scenario/Setting_the_Buffering_Limits_on_>

=head1 Performance Tuning by Tweaking Apache Configuration

Correct configuration of the C<MinSpareServers>, C<MaxSpareServers>, C<StartServers>, C<MaxClients>, and C<MaxRequestsPerChild> parameters is very important. There are no values that are right for everyone: if they are too low, you will under-use the system's capabilities; if they are too high, the chances are that the server will bring the machine to its knees.

All the above parameters should be specified on the basis of the resources you have. With a plain Apache server, it's no big deal if you run many servers, since the processes are about 1Mb each and don't eat a lot of your RAM. Generally the numbers are even smaller with memory sharing. The situation is different with mod_perl: I have seen mod_perl processes of 20Mb and more. Now if you have C<MaxClients> set to 50, that's 50 x 20Mb = 1Gb. Do you have 1Gb of RAM? Maybe not.

So how do you tune the parameters? Generally by trying different combinations and benchmarking the server.
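Before reaching for a benchmark, you can derive a rough upper bound for C<MaxClients> from the memory formula shown earlier. This is only a sketch: the three input numbers are assumptions which you must replace with values measured on your own system (e.g. with C<GTop>):

  # rough MaxClients bound, using the formula shown earlier:
  #   N = (RAM - largest_shared_size) / diff_per_process
  my $ram        = 256 * 1024 * 1024; # RAM dedicated to mod_perl (assumed)
  my $max_shared = 4_000_000;         # largest measured shared size (assumed)
  my $diff       = 2_000_000;         # measured size minus share (assumed)
  printf "MaxClients upper bound: %d\n", ($ram - $max_shared) / $diff;

This only bounds the memory use; the benchmarks below show that the right value also depends on how fast the children can be turned over.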
Again, mod_perl processes can be of a much smaller size with memory sharing.

Before you start this task you should be armed with the proper weapon. You need the B<crashme> utility, which will load your server with the mod_perl scripts you possess. You need it to have the ability to emulate a multiuser environment and to emulate the behavior of multiple clients calling the mod_perl scripts on your server simultaneously. While there are commercial solutions, you can get away with free ones which do the same job. You can use the L<ApacheBench|performance/Configuration_Tuning_with_ApacheBench> B<C<ab>> utility which comes with the Apache distribution, the L<crashme script|performance/the_crashme_Script> which uses C<LWP::Parallel::UserAgent>, L<httperf|performance/httperf> or L<http_load|performance/http_load>.

It is important to make sure that you run the load generator (the client which generates the test requests) on a system that is more powerful than the system being tested. After all we are trying to simulate Internet users, where many users are trying to reach your service at once. Since the number of concurrent users can be quite large, your testing machine must be very powerful and capable of generating a heavy load. Of course you should not run the clients and the server on the same machine. If you do, your test results will be invalid: clients will eat CPU and memory that should be dedicated to the server, and vice versa.

=head2 Configuration Tuning with ApacheBench

We are going to use the C<ApacheBench> (C<ab>) utility to tune our server's configuration. We will simulate 10 users concurrently requesting a very light script at C<http://www.example.com/perl/access/access.cgi>. Each simulated user makes 10 requests.

  % ./ab -n 100 -c 10 http://www.example.com/perl/access/access.cgi

The results are:

  Document Path:          /perl/access/access.cgi
  Document Length:        16 bytes
  Concurrency Level:      10
  Time taken for tests:   1.683 seconds
  Complete requests:      100
  Failed requests:        0
  Total transferred:      16100 bytes
  HTML transferred:       1600 bytes
  Requests per second:    59.42
  Transfer rate:          9.57 kb/s received

  Connection Times (ms)
              min   avg   max
  Connect:      0    29   101
  Processing:  77   124  1259
  Total:       77   153  1360

The only numbers we really care about are:

  Complete requests:      100
  Failed requests:        0
  Requests per second:    59.42

Let's raise the request load to 100 x 10 (10 users, each making 100 requests):

  % ./ab -n 1000 -c 10 http://www.example.com/perl/access/access.cgi
  Concurrency Level:      10
  Complete requests:      1000
  Failed requests:        0
  Requests per second:    139.76

As expected, nothing changes in terms of concurrency -- we still have the same 10 concurrent users; the longer run merely gives a more reliable throughput figure. Now let's raise the number of concurrent users to 50:

  % ./ab -n 1000 -c 50 http://www.example.com/perl/access/access.cgi
  Complete requests:      1000
  Failed requests:        0
  Requests per second:    133.01

We see that the server is capable of serving 50 concurrent users at 133 requests per second! Let's find the upper limit. Using C<-n 10000 -c 1000> failed to get results (Broken Pipe?). Using C<-n 10000 -c 500> resulted in 94.82 requests per second. The server's performance went down with the high load.

The above tests were performed with the following configuration:

  MinSpareServers 8
  MaxSpareServers 6
  StartServers 10
  MaxClients 50
  MaxRequestsPerChild 1500
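If you repeat such runs at different concurrency levels, a tiny driver script saves typing. A minimal sketch, assuming C<ab> sits in the current directory as above and parsing the C<Requests per second> line from its report (the URL is the example one from the text):

  #!/usr/bin/perl -w
  use strict;
  # run ab at a few concurrency levels and pull the
  # "Requests per second" figure out of each report
  my $url = shift || 'http://www.example.com/perl/access/access.cgi';
  for my $c (10, 50, 100) {
      my $report = `./ab -n 1000 -c $c $url 2>&1`;
      my ($rps) = $report =~ /^Requests per second:\s+([\d.]+)/m;
      printf "concurrency %3d => %s req/sec\n",
             $c, defined $rps ? $rps : 'n/a';
  }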
Now let's kill each child after it serves a single request. We will use the following configuration:

  MinSpareServers 8
  MaxSpareServers 6
  StartServers 10
  MaxClients 100
  MaxRequestsPerChild 1

Simulate 50 users each generating a total of 20 requests:

  % ./ab -n 1000 -c 50 http://www.example.com/perl/access/access.cgi

The benchmark timed out with the above configuration. I watched the output of B<C<ps>> as I ran it; the parent process just wasn't capable of respawning the killed children at that rate. When I raised C<MaxRequestsPerChild> to 10, I got 8.34 requests per second. Very bad -- 18 times slower! You can't benchmark the importance of C<MinSpareServers>, C<MaxSpareServers> and C<StartServers> with this kind of test.

Now let's reset C<MaxRequestsPerChild> to 1500, but reduce C<MaxClients> to 10 and run the same test:

  MinSpareServers 8
  MaxSpareServers 6
  StartServers 10
  MaxClients 10
  MaxRequestsPerChild 1500

I got 27.12 requests per second, which is better but still 4-5 times slower. (I got 133 with C<MaxClients> set to 50.)

B<Summary:> I have tested a few combinations of the server configuration variables (C<MinSpareServers>, C<MaxSpareServers>, C<StartServers>, C<MaxClients> and C<MaxRequestsPerChild>). The results I got are as follows: C<MinSpareServers>, C<MaxSpareServers> and C<StartServers> are only important for user response times. Sometimes users will have to wait a bit.

The important parameters are C<MaxClients> and C<MaxRequestsPerChild>. C<MaxClients> should not be too big, so it will not abuse your machine's memory resources, and not too small, for if it is, your users will be forced to wait for the children to become free to serve them. C<MaxRequestsPerChild> should be as large as possible, to get the full benefit of mod_perl, but watch your server at the beginning to make sure your scripts are not leaking memory, thereby causing your server (and your service) to die very fast.

Also it is important to understand that we didn't test the response times in the tests above, but the ability of the server to respond under a heavy load of requests. If the test script was heavier, the numbers would be different but the conclusions very similar.

The benchmarks were run with:

  HW: RS6000, 1Gb RAM
  SW: AIX 4.1.5, mod_perl 1.16, apache 1.3.3

The machine was running only the mysql, httpd docs and mod_perl servers, and was _completely_ unloaded during the benchmarking.

After each server restart when I changed the server's configuration, I made sure that the scripts were preloaded by fetching a script at least once for every child.

It is important to notice that none of the requests timed out, even if a request was kept in the server's queue for more than a minute! That is the way B<ab> works, which is OK for testing purposes but will be unacceptable in the real world -- users will not wait for more than five to ten seconds for a request to complete, and the client (i.e. the browser) will time out in a few minutes.

Now let's take a look at some real code whose execution time is more than a few milliseconds. We will do some real testing and collect the data into tables for easier viewing.
I will use the following abbreviations:

  NR    = Total Number of Requests
  NC    = Concurrency
  MC    = MaxClients
  MRPC  = MaxRequestsPerChild
  RPS   = Requests per second

Running a mod_perl script with lots of mysql queries (the script under test is mysqld limited) (http://www.example.com/perl/access/access.cgi?do_sub=query_form), with the configuration:

  MinSpareServers 8
  MaxSpareServers 16
  StartServers 10
  MaxClients 50
  MaxRequestsPerChild 5000

gives us:

      NR   NC    RPS  comment
  ------------------------------------------------
      10   10   3.33  # not a reliable figure
     100   10   3.94
    1000   10   4.62
    1000   50   4.09

B<Conclusions:> Here I wanted to show that when the application is slow (not due to perl loading, code compilation and execution, but limited by some external operation) it almost does not matter what load we place on the server. The RPS (Requests per second) is almost the same. Given that all the requests have been served, you have the ability to queue the clients, but be aware that anything that goes into the queue means a waiting client and a client (browser) that might time out!

Now we will benchmark the same script without using the mysql (code limited by perl only) (http://www.example.com/perl/access/access.cgi); it's the same script, but it just returns the HTML form, without making SQL queries.

  MinSpareServers 8
  MaxSpareServers 16
  StartServers 10
  MaxClients 50
  MaxRequestsPerChild 5000

      NR   NC    RPS  comment
  ------------------------------------------------
      10   10  26.95  # not a reliable figure
     100   10  30.88
    1000   10  29.31
    1000   50  28.01
    1000  100  29.74
   10000  200  24.92
  100000  400  24.95

B<Conclusions:> This time the script we executed was pure perl (not limited by I/O or mysql), so we see that the server serves the requests much faster. You can see the number of requests per second is almost the same for any load, but goes lower when the number of concurrent clients goes beyond C<MaxClients>. With 25 RPS, the machine simulating a load of 400 concurrent clients will be served in 16 seconds. To be more realistic, assuming a maximum of 100 concurrent clients and 30 requests per second, the client will be served in 3.5 seconds. Pretty good for a highly loaded server.

Now we will use the server to its full capacity, by keeping all C<MaxClients> clients alive all the time and having a big C<MaxRequestsPerChild>, so that no child will be killed during the benchmarking.

  MinSpareServers 50
  MaxSpareServers 50
  StartServers 50
  MaxClients 50
  MaxRequestsPerChild 5000

      NR   NC    RPS  comment
  ------------------------------------------------
     100   10  32.05
    1000   10  33.14
    1000   50  33.17
    1000  100  31.72
   10000  200  31.60

B<Conclusion:> In this scenario there is no overhead involving the parent server loading new children, all the servers are available, and the only bottleneck is contention for the CPU.

Now we will change C<MaxClients> and watch the results. Let's reduce C<MaxClients> to 10:

  MinSpareServers 8
  MaxSpareServers 10
  StartServers 10
  MaxClients 10
  MaxRequestsPerChild 5000

      NR   NC    RPS  comment
  ------------------------------------------------
      10   10  23.87  # not a reliable figure
     100   10  32.64
    1000   10  32.82
    1000   50  30.43
    1000  100  25.68
    1000  500  26.95
    2000  500  32.53

B<Conclusions:> Very little difference! Ten servers were able to serve with almost the same throughput as 50 servers. Why? My guess is CPU throttling. It seems that the 10 servers were serving requests 5 times faster than when we worked with 50 servers; in the latter case each child received its CPU time slice five times less frequently.
So having a big value for C<MaxClients> doesn't mean that the performance will be better. You have just seen the numbers!

Now we will drastically reduce C<MaxRequestsPerChild>:

  MinSpareServers 8
  MaxSpareServers 16
  StartServers 10
  MaxClients 50

      NR   NC  MRPC    RPS  comment
  ------------------------------------------------
     100   10    10   5.77
     100   10     5   3.32
    1000   50    20   8.92
    1000   50    10   5.47
    1000   50     5   2.83
    1000  100    10   6.51

B<Conclusions:> When we drastically reduce C<MaxRequestsPerChild>, the performance starts to become closer to plain mod_cgi.

Here are the numbers of this run with mod_cgi, for comparison:

  MinSpareServers 8
  MaxSpareServers 16
  StartServers 10
  MaxClients 50

      NR   NC   RPS  comment
  ------------------------------------------------
     100   10  1.12
    1000   50  1.14
    1000  100  1.13

B<Conclusion:> mod_cgi is much slower. :) In the first test, when NR/NC was 100/10, mod_cgi was capable of 1.12 requests per second. In the same circumstances, mod_perl was capable of 32 requests per second, nearly 30 times faster! In the first test each client waited about 100 seconds to be served. In the second and third tests they waited 1000 seconds!

=head2 Choosing MaxClients

The C<MaxClients> directive sets the limit on the number of simultaneous requests that can be supported. No more than this number of child server processes will be created. To configure more than 256 clients, you must edit the C<HARD_SERVER_LIMIT> entry in C<httpd.h> and recompile. In our case we want this variable to be as small as possible, because in this way we can limit the resources used by the server children. Since we can restrict each child's process size (see L<Limiting the size of the processes|performance/Limiting_the_Size_of_the_Process>), the calculation of C<MaxClients> is pretty straightforward:

               Total RAM Dedicated to the Webserver
  MaxClients = ------------------------------------
                     MAX child's process size

So if I have 400Mb left for the webserver to run with, I can set C<MaxClients> to 40 if I know that each child is limited to 10Mb of memory (e.g. with L<C<Apache::SizeLimit>|performance/Limiting_the_Size_of_the_Processes>).

You might be wondering what will happen to your server if there are more concurrent users than C<MaxClients> at any time. This situation is signified by the following warning message in the C<error_log>:

  [Sun Jan 24 12:05:32 1999] [error] server reached MaxClients setting,
  consider raising the MaxClients setting

There is no problem -- any connection attempts over the C<MaxClients> limit will normally be queued, up to a number based on the C<ListenBacklog> directive. When a child process is freed at the end of a different request, the connection will be served.

It B<is an error> because clients are being put in the queue rather than getting served immediately, despite the fact that they do not get an error response. The error can be allowed to persist to balance available system resources and response time, but sooner or later you will need to get more RAM so you can start more child processes. The best approach is to try not to have this condition reached at all, and if you reach it often you should start to worry about it.

It's important to understand how much real memory a child occupies. Your children can share memory between them when the OS supports that. You must take action to allow the sharing to happen -- see L<Preload Perl modules at server startup|performance/Preloading_Perl_Modules_at_Serve>. If you do this, the chances are that your C<MaxClients> can be even higher.
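In code form, the simple formula above is just a division. A minimal sketch using the sample figures from the text:

  # the simple MaxClients formula, with the sample numbers from above
  my $total_ram      = 400;   # Mb dedicated to the webserver
  my $max_child_size = 10;    # Mb, e.g. bounded with Apache::SizeLimit
  my $max_clients    = int($total_ram / $max_child_size);
  print "MaxClients = $max_clients\n";   # prints: MaxClients = 40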
But it seems that it's not so simple to calculate the absolute number. If you come up with a solution please let us know! If the shared memory stayed the same size throughout the child's life, we could derive a much better formula:

               Total_RAM + Shared_RAM_per_Child * (MaxClients - 1)
  MaxClients = ---------------------------------------------------
                              Max_Process_Size

which solves to:

                   Total_RAM - Shared_RAM_per_Child
  MaxClients = ---------------------------------------
               Max_Process_Size - Shared_RAM_per_Child

Let's roll some calculations:

  Total_RAM            = 500Mb
  Max_Process_Size     =  10Mb
  Shared_RAM_per_Child =   4Mb

               500 - 4
  MaxClients = ------- = 82
                10 - 4

With no sharing in place:

               500
  MaxClients = --- = 50
                10

With sharing in place you can have 64% more servers without buying more RAM.

If you improve the sharing and keep it at that level, let's say:

  Total_RAM            = 500Mb
  Max_Process_Size     =  10Mb
  Shared_RAM_per_Child =   8Mb

               500 - 8
  MaxClients = ------- = 246
                10 - 8

392% more servers! Now you can feel the importance of having as much shared memory as possible.

=head2 Choosing MaxRequestsPerChild

The C<MaxRequestsPerChild> directive sets the limit on the number of requests that an individual child server process will handle. After C<MaxRequestsPerChild> requests, the child process will die. If C<MaxRequestsPerChild> is 0, the process will live forever.

Setting C<MaxRequestsPerChild> to a non-zero limit solves some memory leakage problems caused by sloppy programming practices, whereby a child process consumes a little more memory after each request. If left unbounded, then after a certain number of requests the children will use up all the available memory and leave the server to die from memory starvation. Note that sometimes standard system libraries leak memory too, especially on OSes with bad memory management (e.g. Solaris 2.5 on the x86 arch).

If this is your case you can set C<MaxRequestsPerChild> to a small number. This will allow the system to reclaim the memory that a greedy child process has consumed, when it exits after C<MaxRequestsPerChild> requests. But beware -- if you set this number too low, you will lose some of the speed bonus you get from mod_perl. Consider using C<Apache::PerlRun> if this is the case.

Another approach is to use the L<Apache::SizeLimit|performance/Limiting_the_Size_of_the_Processes> or the L<Apache::GTopLimit|performance/Keeping_the_Shared_Memory_Limit> modules. By using either of these modules you should be able to stop relying on C<MaxRequestsPerChild>, although for some developers using both in combination does the job. In addition the latter module allows you to kill any servers whose shared memory size drops below a specified limit.

See also L<Preload Perl modules at server startup|performance/Preloading_Perl_Modules_at_Serve> and L<Sharing Memory|performance/Sharing_Memory>.

=head2 Choosing MinSpareServers, MaxSpareServers and StartServers

With mod_perl enabled, it might take as much as 20 seconds from the time you start the server until it is ready to serve incoming requests. This delay depends on the OS, the number of preloaded modules and the process load of the machine. It's best to set C<StartServers> and C<MinSpareServers> to high numbers, so that if you get a high load just after the server has been restarted, the fresh servers will be ready to serve requests immediately. With mod_perl, it's usually a good idea to raise all 3 variables higher than normal.
In order to maximize the benefits of mod_perl, you don't want to kill servers when they are idle; rather you want them to stay up and available to handle new requests immediately. I think an ideal configuration is to set C<MinSpareServers> and C<MaxSpareServers> to similar values, maybe even the same. Having C<MaxSpareServers> close to C<MaxClients> will completely use all of your resources (if C<MaxClients> has been chosen to take full advantage of the resources), but it'll make sure that at any given moment your system will be capable of responding to requests with the maximum speed (assuming that the number of concurrent requests is not higher than C<MaxClients>).

Let's try some numbers. For a heavily loaded web site and a dedicated machine I would think of (note 400Mb is just for example):

  Available to webserver RAM:   400Mb
  Child's memory size bounded:  10Mb
  MaxClients:                   400/10 = 40 (larger with mem sharing)
  StartServers:                 20
  MinSpareServers:              20
  MaxSpareServers:              35

However if I want to use the server for many other tasks, but make it capable of handling a high load, I'd think of:

  Available to webserver RAM:   400Mb
  Child's memory size bounded:  10Mb
  MaxClients:                   400/10 = 40
  StartServers:                 5
  MinSpareServers:              5
  MaxSpareServers:              10

These numbers are taken off the top of my head, and shouldn't be used as a rule, but rather as examples to show you some possible scenarios. Use this information with caution!

=head2 Summary of Benchmarking to tune all 5 parameters

OK, we've run various benchmarks -- let's summarize the conclusions:

=over 4

=item * MaxRequestsPerChild

If your scripts are clean and don't leak memory, set this variable to a number as large as possible (10000?). If you use C<Apache::SizeLimit>, you can set this parameter to 0 (treated as infinity). You will want this parameter to be smaller if your code becomes unshared over the process' life. C<Apache::GTopLimit> also comes into the picture, with its shared memory limitation feature.

=item * StartServers

If you keep a small number of servers active most of the time, keep this number low. Keep it low especially if C<MaxSpareServers> is also low, as if there is no load Apache will kill its children before they have been utilized at all. If your service is heavily loaded, make this number close to C<MaxClients>, and keep C<MaxSpareServers> equal to C<MaxClients>.

=item * MinSpareServers

If your server performs other work besides web serving, make this low so the memory of unused children will be freed when the load is light. If your server's load varies (you get loads in bursts) and you want fast response for all clients at any time, you will want to make it high, so that new children will be respawned in advance and will be waiting to handle bursts of requests.

=item * MaxSpareServers

The logic is the same as for C<MinSpareServers> -- low if you need the machine for other tasks, high if it's a dedicated web host and you want a minimal delay between the request and the response.

=item * MaxClients

Not too low, so you don't get into a situation where clients are waiting for the server to start serving them (they might wait, but not for very long). However, do not set it too high. With a high C<MaxClients>, if you get a high load the server will try to serve all requests immediately. Your CPU will have a hard time keeping up, and if the child size multiplied by the number of running children is larger than the total available RAM, your server will start swapping.
This will slow down everything, which in turn will make things even slower, until eventually your machine will die. It's important that you take pains to ensure that swapping does not normally happen. Swap space is an emergency pool, not a resource to be used routinely. If you are low on memory and you badly need it, buy it. Memory is cheap.

But based on the test I conducted above, even if you have plenty of memory like I have (1Gb), increasing C<MaxClients> sometimes will give you no improvement in performance. The more clients that are running, the more CPU time is required and the fewer CPU time slices each process receives. The response latency (the time to respond to a request) will grow, so you won't see the expected improvement. The best approach is to find the minimum requirement for your kind of service and the maximum capability of your machine. Then start at the minimum and test like I did, successively raising this parameter until you find the region on the curve of latency and/or throughput against C<MaxClients> where the improvement starts to diminish. Stop there and use that value.

When you make the measurements on a production server you will have the ability to tune them more precisely, since you will see the real numbers.

Don't forget that if you add more scripts, or even just modify the existing ones, the processes will grow in size as you compile in more code. The parameters will then probably need to be recalculated.

=back

=head2 KeepAlive

If your mod_perl server's I<httpd.conf> includes the following directives:

  KeepAlive On
  MaxKeepAliveRequests 100
  KeepAliveTimeout 15

you have a real performance penalty, since after completing the processing for each request, the process will wait for C<KeepAliveTimeout> seconds before closing the connection and will therefore not be serving other requests during this time. With this configuration you will need many more concurrent processes on a server with high traffic.

If you use some server status reporting tools, you will see the process in I<K> status when it's in C<KeepAlive> status.

The chances are that you don't want this feature enabled. Set it Off with:

  KeepAlive Off

The other two directives don't matter if C<KeepAlive> is C<Off>.

You might want to consider enabling this option if the client's browser needs to request more than one object from your server for a single HTML page. If this is the situation, then by setting C<KeepAlive> C<On>, for each page you save the HTTP connection overhead for all requests but the first one.

For example if you have a page with 10 ad banners, which is not uncommon today, your server will work more effectively if a single process serves them all during a single connection. However, your client will see a slightly slower response, since banners will be brought one at a time and not concurrently as is the case if each C<IMG> tag opens a separate connection.

Since keepalive connections will not incur the additional three-way TCP handshake, they are kinder to the network.

SSL connections benefit the most from C<KeepAlive> if you haven't configured the server to cache session ids.

You have probably followed the advice to send all the requests for static objects to a plain Apache server. Since most pages include more than one unique static image, you should keep the default C<KeepAlive> setting of the non-mod_perl server, i.e. keep it C<On>. It will probably be a good idea also to reduce the timeout a little.
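For the static front-end server that might look like this (a sketch only; the timeout value here is an arbitrary example, not a benchmarked recommendation):

  # plain Apache front-end serving the static objects:
  # keep KeepAlive On, but shorten the timeout a little
  KeepAlive On
  MaxKeepAliveRequests 100
  KeepAliveTimeout 5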
One option would be for the proxy/accelerator to keep the connection open to the client but make individual connections to the server, read the response, buffer it for sending to the client and close the server connection. Obviously you would make new connections to the server as required by the client's requests.

Also you should know that C<KeepAlive> requests only work with responses that contain a C<Content-Length> header. To send this header do:

  $r->header_out('Content-Length', $length);

=head2 PerlSetupEnv Off

C<PerlSetupEnv Off> is another optimization you might consider. This directive requires mod_perl 1.25 or later.

I<mod_perl> fiddles with the environment to make it appear as if the script were being called under the CGI protocol. For example, the C<$ENV{QUERY_STRING}> environment variable is initialized with the contents of I<Apache::args()>, and the value returned by I<Apache::server_hostname()> is put into C<$ENV{SERVER_NAME}>.

But C<%ENV> population is expensive. Those who have moved to the Perl Apache API no longer need this extra C<%ENV> population, and can gain by turning it B<Off>. Scripts using the C<CGI.pm> module require C<PerlSetupEnv On> because that module relies on a properly populated CGI environment table. By default it is On.

Note that you can still set environment variables. For example when you use the following configuration:

  PerlSetupEnv Off
  PerlModule Apache::RegistryNG
  <Location /perl>
    PerlSetupEnv On
    PerlSetEnv TEST hi
    SetHandler perl-script
    PerlHandler Apache::RegistryNG
    Options +ExecCGI
  </Location>

and you issue a request (for example http://localhost/perl/setupenvoff.pl) for this script:

  setupenvoff.pl
  --------------
  use Data::Dumper;
  my $r = Apache->request();
  $r->send_http_header('text/plain');
  print Dumper(\%ENV);

you should see something like this:

  $VAR1 = {
            'GATEWAY_INTERFACE' => 'CGI-Perl/1.1',
            'MOD_PERL' => 'mod_perl/1.25',
            'PATH' => '/usr/lib/perl5/5.00503:... snipped ...',
            'TEST' => 'hi'
          };

Notice that we have got the value of the environment variable I<TEST>.

=head2 Reducing the Number of stat() Calls Made by Apache

If you watch the system calls that your server makes (using I<truss> or I<strace>) while processing a request, you will notice that a few stat() calls are made. For example when I fetch http://localhost/perl-status and I have my C<DocumentRoot> set to I</home/httpd/docs> I see:

  [snip]
  stat("/home/httpd/docs/perl-status", 0xbffff8cc) = -1
       ENOENT (No such file or directory)
  stat("/home/httpd/docs", {st_mode=S_IFDIR|0755, st_size=1024, ...}) = 0
  [snip]

If you have some dynamic content and your virtual relative URI is something like I</news/perl/mod_perl/summary> (i.e., there is no such directory on the web server, the path components are only used for requesting a specific report), this will generate five(!) stat() calls before the C<DocumentRoot> is found. You will see something like this:

  stat("/home/httpd/docs/news/perl/mod_perl/summary", 0xbffff744) = -1
       ENOENT (No such file or directory)
  stat("/home/httpd/docs/news/perl/mod_perl", 0xbffff744) = -1
       ENOENT (No such file or directory)
  stat("/home/httpd/docs/news/perl", 0xbffff744) = -1
       ENOENT (No such file or directory)
  stat("/home/httpd/docs/news", 0xbffff744) = -1
       ENOENT (No such file or directory)
  stat("/home/httpd/docs", {st_mode=S_IFDIR|0755, st_size=1024, ...}) = 0

How expensive are those calls? Let's use the C<Time::HiRes> module to find out.
  stat_call_sample.pl
  -------------------
  use Time::HiRes qw(gettimeofday tv_interval);
  my $calls = 1_000_000;
  my $start_time = [ gettimeofday ];

  stat "/foo" for 1..$calls;

  my $end_time = [ gettimeofday ];
  my $elapsed = tv_interval($start_time,$end_time) / $calls;

  print "The average execution time: $elapsed seconds\n";

This script takes a time sample at the beginning, then does 1_000_000 C<stat()> calls to a non-existing file, samples the time at the end and prints the average time it took to make a single C<stat()> call. I'm sampling 1M stats so that the average result is accurate.

Before we actually run the script, one should distinguish between two different situations. When the server is idle, the time between the first and the last system call will be much shorter than the same time measured on a loaded system. That is because on an idle system a process can use the CPU very often, while on a loaded system lots of processes compete for it and each process has to wait longer to get the same amount of CPU time.

So first we run the above code on the unloaded system:

  % perl stat_call_sample.pl
  The average execution time: 4.209645e-06 seconds

So it takes about 4 microseconds to execute a stat() call. Now let's start a CPU intensive process in one console. The following code keeps the CPU busy all the time:

  % perl -e '1**1 while 1'

And now run the I<stat_call_sample.pl> script in the other console:

  % perl stat_call_sample.pl
  The average execution time: 8.777301e-06 seconds

You can see that the average time has doubled (about 8 microseconds). This is no surprise, since two processes were competing for the CPU. Now if we run four instances of the above code:

  % perl -e '1**1 while 1' &
  % perl -e '1**1 while 1' &
  % perl -e '1**1 while 1' &
  % perl -e '1**1 while 1' &

and run our script in parallel with these processes, we get:

  % perl stat_call_sample.pl
  The average execution time: 2.0853558e-05 seconds

about 20 microseconds. So the average stat() system call is 5 times longer now. Now if you have 50 mod_perl processes that keep the CPU busy all the time, a stat() call will be 50 times slower and it'll take 0.2 milliseconds to complete a series of calls. If you have five redundant calls, as in the strace example above, they add up to one millisecond. If you have more processes constantly consuming CPU, this time adds up. Now multiply this time by the number of processes that you have and you get a few seconds lost. As usual, for some services this loss is insignificant, while for others it is very significant.

So why does Apache do all these redundant C<stat()> calls? You can blame the default installed C<TransHandler> for this inefficiency. Of course you could supply your own, which would be smart enough not to look for this virtual path and would immediately return C<OK>. But in cases where you have a virtual host that serves only dynamically generated documents, you can override the default C<PerlTransHandler> with this one:

  <VirtualHost 10.10.10.10:80>
    ...
    PerlTransHandler Apache::OK
    ...
  </VirtualHost>

As you see, it affects only this specific virtual host. This has the effect of short circuiting the normal C<TransHandler> processing of trying to find a filesystem component that matches the given URI -- no more 'stat's!

Watching your server under strace/truss can often reveal more performance hits than trying to optimize the code itself! For example, unless configured correctly, Apache might look for the I<.htaccess> file in many places; if you don't have one, this adds many wasted open() calls.
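How you attach the tracer varies by platform; on Linux something along these lines works (a sketch; C<-X> runs Apache in single-process mode, which keeps the trace readable):

  # start a single-process server under strace, logging system calls
  % strace -f -o /tmp/httpd.trace httpd -X

  # or attach to an already running child by its PID
  % strace -p <pid_of_child>

On Solaris you would use I<truss> with its corresponding options instead.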
Let's start with this simple configuration, and try to reduce the number of irrelevant system calls.

  DocumentRoot "/home/httpd/docs"
  <Location /foo/test>
    SetHandler perl-script
    PerlHandler Apache::Foo
  </Location>

The above configuration allows us to make a request to I</foo/test>, and the Perl handler() defined in C<Apache::Foo> will be executed. Notice that in the test setup there is no file to be executed (as there would be with C<Apache::Registry>). There is no I<.htaccess> file either.

This is a typical generated trace:

  stat("/home/httpd/docs/foo/test", 0xbffff8fc) = -1
       ENOENT (No such file or directory)
  stat("/home/httpd/docs/foo", 0xbffff8fc) = -1
       ENOENT (No such file or directory)
  stat("/home/httpd/docs", {st_mode=S_IFDIR|0755, st_size=1024, ...}) = 0
  open("/.htaccess", O_RDONLY) = -1
       ENOENT (No such file or directory)
  open("/home/.htaccess", O_RDONLY) = -1
       ENOENT (No such file or directory)
  open("/home/httpd/.htaccess", O_RDONLY) = -1
       ENOENT (No such file or directory)
  open("/home/httpd/docs/.htaccess", O_RDONLY) = -1
       ENOENT (No such file or directory)
  stat("/home/httpd/docs/test", 0xbffff774) = -1
       ENOENT (No such file or directory)
  stat("/home/httpd/docs", {st_mode=S_IFDIR|0755, st_size=1024, ...}) = 0

Now we modify the C<E<lt>DirectoryE<gt>> entry and add S<AllowOverride None>, which among other things disables I<.htaccess> files, so Apache will not try to open them:

  <Directory />
    AllowOverride None
  </Directory>

We see that the four open() calls for I<.htaccess> have gone:

  stat("/home/httpd/docs/foo/test", 0xbffff8fc) = -1
       ENOENT (No such file or directory)
  stat("/home/httpd/docs/foo", 0xbffff8fc) = -1
       ENOENT (No such file or directory)
  stat("/home/httpd/docs", {st_mode=S_IFDIR|0755, st_size=1024, ...}) = 0
  stat("/home/httpd/docs/test", 0xbffff774) = -1
       ENOENT (No such file or directory)
  stat("/home/httpd/docs", {st_mode=S_IFDIR|0755, st_size=1024, ...}) = 0

Let's try to shortcut the I<foo> location with:

  Alias /foo /

which makes Apache look for the file in the I</> directory and not under I</home/httpd/docs/foo>. Let's run it:

  stat("//test", 0xbffff8fc) = -1 ENOENT (No such file or directory)

Wow, we've got only one stat call left! Let's remove the last C<Alias> setting and use:

  PerlTransHandler Apache::OK

as explained above. When we issue the request, we see no stat() calls. But this is possible only if you serve only dynamically generated documents, i.e. no CGI scripts. Otherwise you will have to write your own I<PerlTransHandler> to handle requests as desired. For example this I<PerlTransHandler> will not look up the file on the filesystem if the URI starts with I</foo>, but will fall through to the default I<PerlTransHandler> otherwise:

  PerlTransHandler 'sub { return shift->uri() =~ m|^/foo| \
                        ? Apache::OK : Apache::DECLINED;}'

Let's see the same configuration using a C<E<lt>PerlE<gt>> section and a dedicated package:

  <Perl>
    package My::Trans;
    use Apache::Constants qw(:common);
    sub handler{
       my $r = shift;
       return OK if $r->uri() =~ m|^/foo|;
       return DECLINED;
    }
    package Apache::ReadConfig;
    $PerlTransHandler = "My::Trans";
  </Perl>

As you see, we have defined the C<My::Trans> package and implemented the handler() function. Then we have assigned this handler to the C<PerlTransHandler>.

Of course you can move the code in the module into an external file (e.g. I<My/Trans.pm>, as sketched below) and configure the C<PerlTransHandler> with

  PerlTransHandler My::Trans

in the normal way (no C<E<lt>PerlE<gt>> section required).
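Here is a minimal sketch of what that external file might look like; the package name and the I</foo> prefix are the ones from the example above:

  # file: My/Trans.pm -- place it somewhere in the server's @INC
  package My::Trans;
  use Apache::Constants qw(:common);

  sub handler {
      my $r = shift;
      # skip the filesystem lookup for /foo URIs
      return OK if $r->uri() =~ m|^/foo|;
      return DECLINED;
  }
  1;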
=head1 TMTOWTDI: Convenience and Habit vs. Performance

TMTOWTDI (sometimes pronounced I<"tim toady">), or I<"There's More Than One Way To Do It">, is the main motto of Perl. In other words, you can reach the same goal by coding in many different styles, using different modules and deploying the same modules in different ways.

Unfortunately when you come to the point where performance is the goal, you might have to learn what's more efficient and what's not. Of course it might mean that you will have to use something that you don't really like, it might be less convenient, or it might just be a matter of habit that one should change.

So this section is about performance trade-offs. For almost every comparison we will provide the theoretical difference and then run benchmarks to support the theory, since however good the theory, it's the numbers we get in practice that matter.

In the following benchmarks, unless stated otherwise, the following Apache configuration has been used:

  MinSpareServers 10
  MaxSpareServers 20
  StartServers 10
  MaxClients 20
  MaxRequestsPerChild 10000

=head2 Apache::Registry PerlHandler vs. Custom PerlHandler

At some point you have to decide whether to use C<Apache::Registry> and similar handlers and stick to writing scripts for the content generation, or to write pure Perl handlers.

C<Apache::Registry> maps a request to a file and generates a subroutine to run the code contained in that file. If you use a S<PerlHandler My::Handler> instead of C<Apache::Registry>, you have a direct mapping from request to subroutine, without the steps in between. These steps include:

=over 4

=item 1

run stat() on the script's filename ($r-E<gt>filename)

=item 2

check that the file exists and is executable

=item 3

generate a Perl package name based on the request's URI ($r-E<gt>uri)

=item 4

go to the directory the script resides in (chdir basename $r-E<gt>filename)

=item 5

compare the file's last modified time with that of the compiled subroutine stored in memory (if it has been compiled already)

=item 6

if modified since compilation, or not yet compiled, compile the subroutine

=item 7

go back to the previous directory (chdir $old_cwd)

=back

If you cut out those steps, you cut out some overhead, plain and simple. Do you I<need> to cut out that overhead? Maybe yes, maybe no: your requirements determine that.

You should take a look at the sister C<Apache::Registry> modules (e.g. C<Apache::RegistryNG> and C<Apache::RegistryBB>) that don't perform all these steps, so you can still choose to stick to using scripts to generate the content. The greatest added value of scripts is that you don't have to modify the configuration file to add the handler configuration and restart the server for each newly written content handler.

Now let's run benchmarks and compare. We want to see the overhead that C<Apache::Registry> adds compared to the custom handler, and whether it becomes insignificant when used for heavy and time consuming code. In order to do that we will run two benchmark sets: the first, a so-called I<light> set, will use an almost empty script that only sends a basic header and one word as content; the second will be a I<heavy> set which adds a time consuming operation to the script's and the handler's code.
For the I<light> set we are going to use the I<registry.pl> script running under C<Apache::Registry>:

  benchmarks/registry.pl
  ----------------------
  use strict;
  print "Content-type: text/plain\r\n\r\n";
  print "Hello";

And the following content generation handler:

  Benchmark/Handler.pm
  --------------------
  package Benchmark::Handler;
  use Apache::Constants qw(:common);

  sub handler{
    $r = shift;
    $r->send_http_header('text/html');
    $r->print("Hello");
    return OK;
  }
  1;

We will add these settings to I<httpd.conf>:

  PerlModule Benchmark::Handler
  <Location /benchmark_handler>
    SetHandler perl-script
    PerlHandler Benchmark::Handler
  </Location>

The first directive preloads and compiles the C<Benchmark::Handler> module. The rest of the lines tell Apache to execute the subroutine C<Benchmark::Handler::handler> when a request with the relative URI I</benchmark_handler> is made.

We will use the usual configuration for C<Apache::Registry> scripts, where all the URIs starting with I</perl> are remapped to the files residing under the I</home/httpd/perl/> directory:

  Alias /perl/ /home/httpd/perl/
  <Location /perl>
    SetHandler perl-script
    PerlHandler +Apache::Registry
    Options ExecCGI
    PerlSendHeader On
  </Location>

We will use C<Apache::RegistryLoader> to preload and compile the script at server startup as well, so that the benchmark is fair and only the processing time is measured. To accomplish the preloading we add the following code to the I<startup.pl> file:

  use Apache::RegistryLoader ();
  Apache::RegistryLoader->new->handler(
              "/perl/benchmarks/registry.pl",
   "/home/httpd/perl/benchmarks/registry.pl");

To create the I<heavy> benchmark set, let's leave the above code examples unmodified but add some CPU intensive processing operation (it could also be an IO operation or a database query):

  my ($x, $y) = (100, 100);
  $y = log ($x ** 100) for (0..10000);

This code does lots of mathematical processing and is therefore very CPU intensive.

Now we are ready to proceed with the benchmark. We will generate 5000 requests with a concurrency level of 15, using the C<Apache::Benchmark> module. Here are the reported results:

  ------------------------------
  name            | avtime  rps
  ------------------------------
  light handler   |     15  911
  light registry  |     21  680
  ------------------------------
  heavy handler   |    183   81
  heavy registry  |    191   77
  ------------------------------

Let's look at the results and answer the previously asked questions.

First let's compare the results from the I<light> set. We can see that the average overhead added by C<Apache::Registry> (compared to the custom handler) is about:

  21 - 15 = 6 milliseconds per request.

Thus the difference in speed is about 40% (15 vs. 21). Note that this doesn't mean that the difference in real world applications would be that big, and the results of the I<heavy> set confirm it.

In the I<heavy> set the average processing time is almost the same for C<Apache::Registry> and the custom handler. You can clearly see that the difference between the two is almost the same as the one we saw in the I<light> set's results: it has grown from 6 milliseconds to 8 milliseconds (191-183). This means that the identical heavy code that was added ran for about 168 milliseconds (183-15). That doesn't mean the added code itself ran for 168 milliseconds; it means that it took 168 milliseconds for this code to complete in a multi-process environment where each process gets a time slice to use the CPU.
The more processes that are running, the longer a process has to wait for its next time slice on the CPU.

We have the second question answered as well. You can see that when the code is not just a I<hello> script, the overhead of the extra operations done by the C<Apache::Registry> module is almost insignificant. It's non-zero though, so it depends on your requirements: if another 5-10 milliseconds of overhead are quite tolerable, you may choose to use C<Apache::Registry>.

The interesting thing is that when the server under test runs on a very slow machine the results are completely different. I'll present them here for comparison:

  ------------------------------
  name            | avtime  rps
  ------------------------------
  light handler   |     50  196
  light registry  |    160   61
  ------------------------------
  heavy handler   |    149   67
  heavy registry  |    822   12
  ------------------------------

First of all, the difference of 6 milliseconds in the average processing time that we saw on the fast machine when running the I<light> set has now grown to 110 milliseconds. This means that the few extra operations that C<Apache::Registry> does turn out to be very expensive on the slow machine.

Second, you can see that when the I<heavy> set is used, there is no preservation of the 110 milliseconds as we saw on the fast machine, which we would obviously expect to see, since the code that was added should take the same time to execute in the handler and in the script. Instead we see a difference of 673 milliseconds (822-149).

The explanation lies in the fact that the difference between the machines isn't merely in the CPU speed. It's possible that many other things are different as well: for example the size of the processor cache. If one machine has a processor cache large enough to hold the whole handler and the other doesn't, this can be very significant, given that in our I<heavy> benchmark set 99.9% of the CPU activity was dedicated to running the calculation code.

This shows you again that none of the results and conclusions made here should be taken for granted. The chances are that you will see similar behavior on your machine, but only after you have run the benchmarks and analyzed the results can you be sure what is best for the setup under test. If you later happen to use a different machine, make sure to run the tests again, as they can lead to a completely different decision, as we found when we tried the same benchmark on a different machine.

=head2 "Bloatware" modules

Perl modules like IO:: are very convenient, but let's see what it costs us to use them.
(perl5.6.0 over OpenBSD)

  % wc `perl -MIO -e 'print join("\n", sort values %INC, "")'`
     124     696    4166 /usr/local/lib/perl5/5.6.0/Carp.pm
     580    2465   17661 /usr/local/lib/perl5/5.6.0/Class/Struct.pm
     400    1495   10455 /usr/local/lib/perl5/5.6.0/Cwd.pm
     313    1589   10377 /usr/local/lib/perl5/5.6.0/Exporter.pm
     225     784    5651 /usr/local/lib/perl5/5.6.0/Exporter/Heavy.pm
      92     339    2813 /usr/local/lib/perl5/5.6.0/File/Spec.pm
     442    1574   10276 /usr/local/lib/perl5/5.6.0/File/Spec/Unix.pm
     115     398    2806 /usr/local/lib/perl5/5.6.0/File/stat.pm
     406    1350   10265 /usr/local/lib/perl5/5.6.0/IO/Socket/INET.pm
     143     429    3075 /usr/local/lib/perl5/5.6.0/IO/Socket/UNIX.pm
    7168   24137  178650 /usr/local/lib/perl5/5.6.0/OpenBSD.i386-openbsd/Config.pm
     230    1052    5995 /usr/local/lib/perl5/5.6.0/OpenBSD.i386-openbsd/Errno.pm
     222     725    5216 /usr/local/lib/perl5/5.6.0/OpenBSD.i386-openbsd/Fcntl.pm
      47     101     669 /usr/local/lib/perl5/5.6.0/OpenBSD.i386-openbsd/IO.pm
     239     769    5005 /usr/local/lib/perl5/5.6.0/OpenBSD.i386-openbsd/IO/Dir.pm
     169     549    3956 /usr/local/lib/perl5/5.6.0/OpenBSD.i386-openbsd/IO/File.pm
     594    2180   14772 /usr/local/lib/perl5/5.6.0/OpenBSD.i386-openbsd/IO/Handle.pm
     252     755    5375 /usr/local/lib/perl5/5.6.0/OpenBSD.i386-openbsd/IO/Pipe.pm
      77     235    1709 /usr/local/lib/perl5/5.6.0/OpenBSD.i386-openbsd/IO/Seekable.pm
     428    1419   10219 /usr/local/lib/perl5/5.6.0/OpenBSD.i386-openbsd/IO/Socket.pm
     452    1401   10554 /usr/local/lib/perl5/5.6.0/OpenBSD.i386-openbsd/Socket.pm
     127     473    3554 /usr/local/lib/perl5/5.6.0/OpenBSD.i386-openbsd/XSLoader.pm
      52     161    1050 /usr/local/lib/perl5/5.6.0/SelectSaver.pm
     139     541    3754 /usr/local/lib/perl5/5.6.0/Symbol.pm
     161     609    4081 /usr/local/lib/perl5/5.6.0/Tie/Hash.pm
     109     390    2479 /usr/local/lib/perl5/5.6.0/strict.pm
      79     370    2589 /usr/local/lib/perl5/5.6.0/vars.pm
     318    1124   11975 /usr/local/lib/perl5/5.6.0/warnings.pm
      30      85     722 /usr/local/lib/perl5/5.6.0/warnings/register.pm
   13733   48195  349869 total

Moreover, that requires 116 happy trips through the kernel's namei(). It syscalls open() a remarkable 57 times, 17 of which failed, leaving 40 that were successful. It also syscalled read() a curiously identical 57 times, ingesting a total of 180,265 plump bytes. To top it off, this B<I<increases your resident set size by two megabytes!>> Happy mallocking...

It seems that C<CGI.pm> suffers from the same disease:

  % wc `perl -MCGI -le 'print for values %INC'`
    1368    6920   43710 /usr/local/lib/perl5/5.6.0/overload.pm
    6481   26122  200840 /usr/local/lib/perl5/5.6.0/CGI.pm
    7849   33042  244550 total

You have 16 trips through namei, 7 successful opens, 2 unsuccessful ones, and 213k of data read in.

Here is a script, I<perlbloat.pl>, that shows how much memory is acquired by Perl when you run some code. With it we can easily test the overhead of loading modules.

  #!/usr/bin/perl -w

  use GTop ();

  my $gtop = GTop->new;
  my $before = $gtop->proc_mem($$)->size;

  for (@ARGV) {
      if (eval "require $_") {
          eval { $_->import; };
      }
      else {
          eval $_;
          die $@ if $@;
      }
  }

  my $after = $gtop->proc_mem($$)->size;
  printf "@ARGV added %s\n", GTop::size_string($after - $before);

Now let's try to load C<IO>, which loads C<IO::Handle>, C<IO::Seekable>, C<IO::File>, C<IO::Pipe>, C<IO::Socket> and C<IO::Dir>:

  % ./perlbloat.pl 'use IO;'
  use IO; added  1.5M

I<"Only"> 1.5 MB of overhead. Now let's load CGI (v2.74) and compile all its methods:

  % ./perlbloat.pl 'use CGI; CGI->compile(":all")'
  use CGI; CGI->compile(":all") added  1.8M

Almost 2MB of extra memory. Let's compare C<CGI.pm> with its younger sister, whose internals are implemented in C:
  % ./perlbloat.pl 'use Apache::Request'
  use Apache::Request added   48k

48KB. A significant difference, isn't it?

The following numbers show memory sizes in KB (virtual and resident) for v5.6.0 of Perl on four different operating systems. Three measurements were taken for each OS: without any modules, with just C<-MCGI>, and with C<-MIO> (never with both):

            OpenBSD      FreeBSD     Redhat Linux   Solaris
             vsz   rss    vsz   rss    vsz    rss    vsz   rss
  Raw Perl   736   772    832  1208   2412    980   2928  2272
  w/ CGI    1220  1464   1308  1828   2972   1768   3616  3232
  w/ IO     2292  2580   2456  3016   4080   2868   5384  4976

Anybody who's thinking of choosing one of these might do well to digest these numbers first.

=head2 Apache::args vs. Apache::Request::param vs. CGI::param

Let's write three C<Apache::Registry> scripts that use C<Apache::args>, C<Apache::Request::param> and C<CGI::param> to process a form's input and print it out. Notice that C<Apache::args> is considered identical to C<Apache::Request::param> only when you have single-valued keys. In the case of multi-valued keys (e.g. when using checkbox groups) you will have to write some more code, since if you do a simple:

  my %params = $r->args;

only the last value will be stored and the rest will collapse, something that you can solve with C<Apache::Request>'s param() method, which returns all the values in list context:

  my @values = $q->param('key');

In addition C<Apache::Request> and C<CGI.pm> have many more functions that ease input processing, like handling file uploads. However, C<Apache::Request> is much faster, since its guts are implemented in C, glued to Perl using XS code.

Assuming that the only functionality you need is the parsing of key-value pairs, and that every key has a single value, we will compare the following almost identical scripts, by trying to pass various query strings.

The code that we have used:

  processing_with_apache_args.pl
  ------------------------------
  use strict;
  my $r = shift;
  $r->send_http_header('text/plain');
  my %args = $r->args;
  print join "\n", map {"$_ => ".$args{$_} } keys %args;

  processing_with_apache_request.pl
  ---------------------------------
  use strict;
  use Apache::Request ();
  my $r = shift;
  my $q = Apache::Request->new($r);
  $r->send_http_header('text/plain');
  my %args = map {$_ => $q->param($_) } $q->param;
  print join "\n", map {"$_ => ".$args{$_} } keys %args;

  processing_with_cgi_pm.pl
  -------------------------
  use strict;
  use CGI;
  my $r = shift;
  $r->send_http_header('text/plain');
  my $q = new CGI;
  my %args = map {$_ => $q->param($_) } $q->param;
  print join "\n", map {"$_ => ".$args{$_} } keys %args;

All three scripts were preloaded at server startup:

  <Perl>
      use Apache::RegistryLoader ();
      Apache::RegistryLoader->new->handler(
                "/perl/processing_with_cgi_pm.pl",
     "[ROOT_DIR]/httpd/perl/processing_with_cgi_pm.pl");
      Apache::RegistryLoader->new->handler(
                "/perl/processing_with_apache_request.pl",
     "[ROOT_DIR]/httpd/perl/processing_with_apache_request.pl");
      Apache::RegistryLoader->new->handler(
                "/perl/processing_with_apache_args.pl",
     "[ROOT_DIR]/httpd/perl/processing_with_apache_args.pl");
  </Perl>

And the results:

  -------------------------------------------------------------
  name            query_length | avtime completed failed    rps
  -------------------------------------------------------------
  apache_args               25 |     69      5000      0    698
  apache_request            25 |     76      5000      0    632
  apache_args              337 |     97      5000      0    500
  cgi_pm                    25 |    115      5000      0    422
  apache_request           337 |    159      5000      0    308
  cgi_pm                   337 |    301      5000      0    163
  -------------------------------------------------------------

Non-varying sub-test parameters:
  -------------------------------------------------------------
  concurrency : 50
  connections : 5000

We have used two different query strings, generated by:

  my $query = [
     join("&", map {"$_=".'e' x 10} ('a'..'b')),
     join("&", map {"$_=".'e' x 10} ('a'..'z')),
  ];

The first one renders into:

  a=eeeeeeeeee&b=eeeeeeeeee

which is 25 characters in length. The other is similar but 337 characters long. That is where the numbers in the C<query_length> column of the report come from.

You can see that C<Apache::args> is much faster than the other two modules, and that C<Apache::Request::param> is much faster than C<CGI::param>.

=head2 Using $|=1 Under mod_perl and Better print() Techniques.

As you know, C<local $|=1;> disables the buffering of the currently selected file handle (the default is C<STDOUT>). If you enable it, C<ap_rflush()> is called after each C<print()>, unbuffering Apache's IO.

If you are using multiple C<print()> calls (_bad_ style in generating output) or if you just have too many of them, then you will experience a degradation in performance. The severity depends on the number of print() calls that you make.

Many old CGI scripts were written like this:

  print "<BODY BGCOLOR=\"black\" TEXT=\"white\">";
  print "<H1>";
  print "Hello";
  print "</H1>";
  print "<A HREF=\"foo.html\"> foo </A>";
  print "</BODY>";

This example has multiple C<print()> calls, which will cause performance degradation with C<$|=1>. It also uses too many backslashes. This makes the code less readable, and it is also more difficult to format the HTML so that it is easily readable as the script's output. The code below solves the problems:

  print qq{
    <BODY BGCOLOR="black" TEXT="white">
      <H1>
        Hello
      </H1>
      <A HREF="foo.html"> foo </A>
    </BODY>
  };

I guess you see the difference. Be careful though, when printing an C<E<lt>HTMLE<gt>> tag. The correct way is:

  print qq{<HTML>
    <HEAD></HEAD>
    <BODY>
  }

If you try the following:

  print qq{
    <HTML>
    <HEAD></HEAD>
    <BODY>
  }

it might not render correctly everywhere: some older browsers expect the first characters after the headers and the empty line to be C<E<lt>HTMLE<gt>> with I<no> spaces before the opening left angle-bracket. If there are any other characters, they might not accept the output as HTML and might print it as plain text. Even if it works with your browser, it might not work for others.

Another approach is to use `here' documents, e.g.:

  print <<EOT;
  <HTML>
  <HEAD></HEAD>
  <BODY>
  EOT

Now let's go back to the C<$|=1> topic. I still disable buffering, for two reasons:

=over 4

=item *

I use relatively few C<print()> calls. I achieve this by arranging for my C<print()> statements to print multiline HTML, and not one line per C<print()> statement.

=item *

I want my users to see the output immediately. So if I am about to produce the results of a DB query which might take some time to complete, I want users to get some text while they are waiting. This improves the usability of my site. Ask yourself which you like better: getting the output a bit slower, but steadily from the moment you've pressed the B<Submit> button, or having to watch the "falling stars" for a while and then getting the whole output at once, even if it's a few milliseconds faster -- assuming the browser didn't time out during the wait.

=back

An even better solution is to keep buffering enabled, and use the Perl API C<rflush()> call to flush the buffers when needed. This way you can place the first part of the page that you are going to send to the user in the buffer, and flush it a moment before you are going to do some lengthy operation, like a DB query.
So you kill two birds with one stone: you show some of the data to the user immediately, so she will feel that something is actually happening, and you have no performance hit from disabled buffering.

  use CGI ();
  my $r = shift;
  my $q = new CGI;
  print $q->header('text/html');
  print $q->start_html;
  print $q->p("Searching...Please wait");
  $r->rflush;
  # imitate a lengthy operation
  for (1..5) {
    sleep 1;
  }
  print $q->p("Done!");

B<Conclusion>: Do not blindly follow suggestions, but think what is best for you in each case.

=head2 Global vs. Fully Qualified Variables

It's always a good idea to avoid using global variables where possible. Some variables must be either global, such as C<@ISA>, or else fully qualified, such as C<@MyModule::ISA>, so that Perl can see them from different packages.

A combination of the C<strict> and C<vars> pragmas keeps modules clean and reduces a bit of noise. However, the C<vars> pragma also creates aliases, as does C<Exporter>, which eat up more memory. When possible, try to use fully qualified names instead of C<use vars>.

For example write:

  package MyPackage1;
  use strict;
  use vars; # added only for fair comparison
  @MyPackage1::ISA = qw(CGI);
  $MyPackage1::VERSION = "1.00";
  1;

instead of:

  package MyPackage2;
  use strict;
  use vars qw(@ISA $VERSION);
  @ISA = qw(CGI);
  $VERSION = "1.00";
  1;

Note that we have added the C<vars> pragma to the package that doesn't need it, so that the memory comparison is fair.

Here are the numbers under Perl version 5.6.0:

  % perl -MGTop -MMyPackage1 -le 'print GTop->new->proc_mem($$)->size'
  2023424
  % perl -MGTop -MMyPackage2 -le 'print GTop->new->proc_mem($$)->size'
  2031616

We have a difference of 8192 bytes. So a few global variables declared with the C<vars> pragma add about 8KB of overhead.

Note that Perl 5.6.0 introduced the new our() declaration, which works like my() scope-wise, but declares global variables:

  package MyPackage3;
  use strict;
  use vars; # not needed, added only for fair comparison
  our @ISA = qw(CGI);
  our $VERSION = "1.00";
  1;

It uses the same amount of memory as a fully qualified global variable:

  % perl -MGTop -MMyPackage3 -le 'print GTop->new->proc_mem($$)->size'
  2023424

Imported symbols act just like global variables; they can add up quickly:

  % perlbloat.pl 'use POSIX ()'
  use POSIX () added  316k

  % perlbloat.pl 'use POSIX'
  use POSIX added  696k

That's 380k worth of aliases. Now let's say 6 different C<Apache::Registry> scripts C<'use POSIX;'> for strftime() or some other function:

  6 * 380k = 2.3Mb

One could save 2.3Mb per single process with C<'use POSIX ();'> and fully qualified C<POSIX::> function calls.

=head2 Object Methods Calls vs. Function Calls

Which subroutine calling form is more efficient: object methods or functions?

=head3 The Overhead with Light Subroutines

Let's do some benchmarking. We will start by using empty methods, which will allow us to measure the real difference in the overhead each kind of call introduces. We will use this code:

  bench_call1.pl
  --------------
  package Foo;

  use strict;
  use Benchmark;

  sub bar { };

  timethese(50_000, {
           method   => sub { Foo->bar()      },
           function => sub { Foo::bar('Foo');},
          });

The two calls are equivalent, since both pass the class name as their first parameter; I<function> does this explicitly, while I<method> does this transparently.

The benchmarking result:

  Benchmark: timing 50000 iterations of function, method...
    function:  0 wallclock secs ( 0.80 usr +  0.05 sys =  0.85 CPU)
      method:  1 wallclock secs ( 1.51 usr +  0.08 sys =  1.59 CPU)

We are interested in the 'total CPU times' and not the 'wallclock seconds'. It's possible that the load on the system was different for the two tests while benchmarking, so the wallclock times give us no useful information.

We see that the I<method> calling type is almost twice as slow as the I<function> call: 1.59 CPU compared to 0.85 CPU of total execution time. Why does this happen? Because the difference between functions and methods is the time taken to resolve the pointer from the object, to find the module it belongs to and then the actual method. The function form has one parameter less to pass, fewer stack operations, and less time to get to the guts of the subroutine.

perl5.6+ does better method caching: C<Foo-E<gt>method()> is a little bit faster (some constant folding magic), but not C<Foo-E<gt>$method()>. And the improvement does not address the C<@ISA> lookup that still happens in either case.

=head3 The Overhead with Heavy Subroutines

But that doesn't mean that you shouldn't use methods. Generally your functions do something, and the more they do the less significant is the time to perform the call, because the calling time is effectively fixed and is probably a very small overhead in comparison to the execution time of the method or function itself. Therefore, the longer the execution time of the function, the smaller the relative overhead of the method call. The next benchmark proves this point:

  bench_call2.pl
  --------------
  package Foo;
  use strict;
  use Benchmark;

  sub bar {
    my $class = shift;

    my ($x,$y) = (100,100);
    $y = log ($x ** 10)  for (0..20);
  };

  timethese(50_000, {
      method   => sub { Foo->bar()       },
      function => sub { Foo::bar('Foo'); },
  });

We get very close benchmarks!

  function: 33 wallclock secs (15.81 usr +  1.12 sys = 16.93 CPU)
    method: 32 wallclock secs (18.02 usr +  1.34 sys = 19.36 CPU)

Let's make the subroutine I<bar> even slower:

  sub bar {
    my $class = shift;

    my ($x,$y) = (100,100);
    $y = log ($x ** 10)  for (0..40);
  };

And the result is amazing: this time the I<method> call convention was faster than the I<function> one:

  function: 81 wallclock secs (25.63 usr +  1.84 sys = 27.47 CPU)
    method: 61 wallclock secs (19.69 usr +  1.49 sys = 21.18 CPU)

In case your functions do very little, like the functions that generate HTML tags in C<CGI.pm>, the overhead might become significant. If your goal is speed you might consider using the I<function> form, but if you write a big and complicated application, it's much better to use the I<method> form, as it will make your code easier to develop, maintain and debug, saving programmer time which, over the life of a project, may turn out to be the most significant cost factor.

=head3 Are All Methods Slower than Functions?

Some modules' APIs are misleading. For example, C<CGI.pm> allows you to execute its subroutines as functions or as methods. As you will see in a moment, its functional form of the calls is slower than the method form, because it does some voodoo work when the functional form is used.

  use CGI;
  my $q = new CGI;
  $q->param('x',5);
  my $x = $q->param('x');

vs

  use CGI qw(:standard);
  param('x',5);
  my $x = param('x');

As usual, let's benchmark some very light calls and compare.
Ideally we would expect the I<methods> to be slower than the I<functions>, based on the previous benchmarks:

  bench_call3.pl
  ---------------
  use Benchmark;

  use CGI qw(:standard);
  $CGI::NO_DEBUG = 1;
  my $q = new CGI;
  my $x;
  timethese(20000, {
      method   => sub { $q->param('x',5); $x = $q->param('x'); },
      function => sub {     param('x',5); $x =     param('x'); },
  });

The benchmark is written in such a way that all the initializations are done at the beginning, so that we get as accurate performance figures as possible. Let's run it:

  % ./bench_call3.pl

  function: 51 wallclock secs (28.16 usr +  2.58 sys = 30.74 CPU)
    method: 39 wallclock secs (21.88 usr +  1.74 sys = 23.62 CPU)

As we can see, the methods are faster than the functions, which seems to contradict what we expected. The explanation lies in the way C<CGI.pm> is implemented. C<CGI.pm> uses some I<fancy> tricks to make the same routine act both as a I<method> and as a plain I<function>. The overhead of checking whether the argument list looks like a I<method> invocation or not masks the slight difference in time between the two calling forms.

If you are intrigued and want to investigate further by yourself, the subroutine you want to explore is called I<self_or_default>. The first line of this function short-circuits if you are using the object methods, but the whole function is called if you are using the functional forms. Therefore, the functional form should be slightly slower than the object form.

=head2 Imported Symbols and Memory Usage

There is a real memory hit when you import all of the functions into your process' memory. This can significantly enlarge memory requirements, particularly when there are many child processes.

In addition to polluting the namespace, when a process imports symbols from any module or any script it grows by the size of the space allocated for those symbols. The more you import (e.g. C<qw(:standard)> vs C<qw(:all)>), the more memory will be used.

Let's say the overhead is of size X. Now take the number of scripts in which you deploy the imported function interface; let's call that Y. Finally let's say that you have a number of processes equal to Z.

You will need X*Y*Z of additional memory. Taking X=10k, Y=10, Z=30, we get 10k*10*30 = 3Mb!!! Now you understand the difference.

Let's benchmark C<CGI.pm> using C<GTop.pm>. First we will try it with no exporting at all:

  use GTop ();
  use CGI ();
  print GTop->new->proc_mem($$)->size;

  1,949,696

Now exporting a few dozen symbols:

  use GTop ();
  use CGI qw(:standard);
  print GTop->new->proc_mem($$)->size;

  1,966,080

And finally exporting all the symbols (about 130):

  use GTop ();
  use CGI qw(:all);
  print GTop->new->proc_mem($$)->size;

  1,970,176

Results:

  import symbols  size(bytes)  delta(bytes)
                               relative to ()
  --------------------------------------
  ()              1949696             0
  qw(:standard)   1966080         16384
  qw(:all)        1970176         20480

So in my example above X=20k =E<gt> 20K*10*30 = 6Mb. You will need 6Mb more when importing all of C<CGI.pm>'s symbols than when you import none at all.

Generally you use more than one script, run more than one process and probably import more symbols from the additional modules that you deploy, so the real numbers are much bigger. Note also that the functional interface is faster in the general case, because a function call avoids the overhead of resolving the method through the object.
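For instance, if all you need from C<POSIX> is strftime(), here is a sketch of the memory-friendly style. This mirrors the C<use POSIX ();> advice given earlier; strftime() is just an illustration, and the same pattern works for any exportable function:

  use POSIX ();    # empty import list: no aliases are created

    # call the function by its fully qualified name instead
  my $stamp = POSIX::strftime("%Y/%m/%d %H:%M:%S", localtime);
  print "$stamp\n";

The only cost is a little extra typing, while each process is spared the memory for the imported aliases.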
If you are looking for performance improvements, you will have to face the fact that simply typing the fully qualified C<My::Module::my_method> instead of importing it might save you a good chunk of memory, provided the routine doesn't have to be invoked as an object method; and even then, the object can be passed to it explicitly as an argument.

I strongly endorse L<Apache::Request (libapreq) - Generic Apache Request Library|modules/Apache_Request_libapreq_Gen>. Its core is written in C, giving it a significant memory and performance benefit. It has all the functionality of C<CGI.pm> except the HTML generation functions.

=head2 Interpolation, Concatenation or List

Somewhat overlapping with the previous section, we want to revisit the various approaches to munging strings, and compare the speed of using lists of strings with interpolation. We will add a string concatenation angle as well.

When the strings are small, it almost doesn't matter whether interpolation or a list is used. Here is a benchmark:

  use Benchmark;
  use Symbol;
  my $fh = gensym;
  open $fh, ">/dev/null" or die;

  my($one, $two, $three, $four) = ('a'..'d');

  timethese(1_000_000, {
      interp => sub { print $fh "$one$two$three$four";     },
      list   => sub { print $fh $one, $two, $three, $four; },
      conc   => sub { print $fh $one.$two.$three.$four;    },
  });

  Benchmark: timing 1000000 iterations of conc, interp, list...
        conc:  3 wallclock secs ( 3.38 usr +  0.00 sys =  3.38 CPU)
      interp:  3 wallclock secs ( 3.45 usr + -0.01 sys =  3.44 CPU)
        list:  2 wallclock secs ( 2.58 usr +  0.00 sys =  2.58 CPU)

The concatenation technique is very similar to interpolation. The list technique is a little bit faster than interpolation. But when the strings are large, lists are significantly faster. We have seen this in the previous section, and here is another benchmark to increase our confidence in our conclusion. This time we use 1000 character long strings:

  use Benchmark;
  use Symbol;
  my $fh = gensym;
  open $fh, ">/dev/null" or die;

  my($one, $two, $three, $four) = map { $_ x 1000 } ('a'..'d');

  timethese(500_000, {
      interp => sub { print $fh "$one$two$three$four";     },
      list   => sub { print $fh $one, $two, $three, $four; },
      conc   => sub { print $fh $one.$two.$three.$four;    },
  });

  Benchmark: timing 500000 iterations of conc, interp, list...
        conc:  5 wallclock secs ( 4.47 usr +  0.27 sys =  4.74 CPU)
      interp:  4 wallclock secs ( 4.25 usr +  0.26 sys =  4.51 CPU)
        list:  4 wallclock secs ( 2.87 usr +  0.16 sys =  3.03 CPU)

In this case using a list is about 30% faster than interpolation. Concatenation is a little bit slower than interpolation.

Let's look at this code:

  $title = 'My Web Page';
  print "<h1>$title</h1>";         # Interpolation (slow)
  print '<h1>' . $title . '</h1>'; # Concatenation (slow)
  print '<h1>', $title, '</h1>';   # List (fast for long strings)

When you use I<"E<lt>h1E<gt>$titleE<lt>/h1E<gt>"> Perl does interpolation (since C<""> is an operator in Perl): it must parse the contents of the string and replace any variables or expressions it finds with their respective values. This uses more memory and is slower than using a list. Of course if there are no variables to interpolate it makes no difference whether you use C<"string"> or C<'string'>.

Concatenation is also potentially slow, since Perl might create a temporary string which it then prints.

Lists are fast because Perl can simply deal with each element in turn. This is true as long as you don't run join() on the list at the end to create a single string from its elements; creating that new string might be slower than appending to an existing string directly. A quick benchmark sketch follows.
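Here is a hedged benchmark sketch, in the same style as the benchmarks above, for comparing the two approaches. We haven't quoted results for this variant in this section, so treat the outcome as something to measure on your own system:

  use Benchmark;
  use Symbol;
  my $fh = gensym;
  open $fh, ">/dev/null" or die;

  my @parts = map { $_ x 1000 } ('a'..'d');

  timethese(500_000, {
      list => sub { print $fh @parts;           },  # print the list as-is
      join => sub { print $fh join('', @parts); },  # build a new string first
  });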
[ReaderMETA]: Please send more mod_perl relevant Perl performance hints

=head2 Using Perl stat() Call's Cached Results

When you do a stat() (or its variations C<-M> -- last modification time, C<-A> -- last access time, C<-C> -- last inode-change time, etc.), the returned information is cached internally. If you need to make an additional check on the same file, use the C<_> magic file handle and save the overhead of an unnecessary stat() call. For example, when testing for existence and read permissions you might use:

  my $filename = "./test";
    # three stat() calls
  print "OK\n" if -e $filename and -r $filename;
  my $mod_time = (-M $filename) * 24 * 60 * 60;
  print "$filename was modified $mod_time seconds before startup\n";

or the more efficient:

  my $filename = "./test";
    # one stat() call
  print "OK\n" if -e $filename and -r _;
  my $mod_time = (-M _) * 24 * 60 * 60;
  print "$filename was modified $mod_time seconds before startup\n";

Two stat() calls were saved!

=head1 Apache::Registry and Derivatives Specific Notes

These are the sections that deal solely with C<Apache::Registry> and derived modules, like C<Apache::PerlRun> and C<Apache::RegistryBB>. No Perl handler code is discussed here, so if you don't use these modules, feel free to skip this section.

=head2 Be Careful with Symbolic Links

As you know, C<Apache::Registry> caches the scripts in packages whose names are constructed from the scripts' URIs. If the same script can be reached by different URIs, which is possible if you have used symbolic links, you will get the same script stored twice in memory. For example:

  % ln -s /home/httpd/perl/news/news.pl /home/httpd/perl/news.pl

Now the script can be reached through both URIs, I</news/news.pl> and I</news.pl>. This doesn't really matter until you advertise the two URIs and users reach the same script through both of them.

So let's assume that you have issued requests to both URIs:

  http://localhost/perl/news/news.pl
  http://localhost/perl/news.pl

To spot the duplication you should use the L<C<Apache::Status>|debug/Apache__Status____Embedded_Interpreter_Status_Information> module. Amongst other things, it shows all the compiled C<Apache::Registry> scripts (listed by their respective packages). If you are using the default configuration directives you should either use this URI:

  http://localhost/perl-status?rgysubs

or just go to the main menu at:

  http://localhost/perl-status

and click on the C<Compiled Registry Scripts> menu item.

META: we need a screen snapshot here!!!

If the script was accessed both through the URI that was remapped to the real file and through the URI that was remapped to the symbolic link, you will see the following output:

  Apache::ROOT::perl::news::news_2epl
  Apache::ROOT::perl::news_2epl

You should run the server in single process mode to see it immediately. If you test in normal mode, it's possible that some child processes will show only one entry or none at all, since they might not have served the same requests as the others. For more hints see the section "L<Run the server in single mode|control/Running_a_Server_in_Single_Proce>".

=head1 Improving Performance by Prevention

There are two ways to improve performance: one is by tuning to squeeze the most out of your hardware and software; the other is to prevent certain bad things from happening, like impolite robots that crawl your site without pausing between requests, memory leaks, the memory becoming unshared, or some processes taking up all the CPU.
In the following sections we are going to discuss the tools and programming techniques that will help you keep your service in order, even when you are not around.

=head2 Memory Leakage

Scripts under mod_perl can very easily leak memory! Global variables stay around indefinitely; lexically scoped variables (declared with C<my()>) are destroyed when they go out of scope, provided there are no references to them from outside that scope.

Perl doesn't return the memory it acquired from the kernel. It does reuse it though!

=head3 Reading In A Whole File

  open IN, $file or die $!;
  local $/ = undef; # will read the whole file in
  $content = <IN>;
  close IN;

If your file is 5Mb, the child which served that script will grow by exactly that size. Now if you have 20 children, and all of them serve this CGI script, together they will consume 20*5M = 100M of RAM! If that's the case, try to use other approaches to processing the file, if possible. Try to process a line at a time, printing each line as soon as it has been processed. If you need to modify the file itself, use a temporary file. When finished, overwrite the source file. Make sure you use a locking mechanism!

=head3 Copying Variables Between Functions

Now let's talk about passing variables by value. Let's use the example above, assuming we have no choice but to read the whole file before any data processing takes place. Now you have some imaginary C<process()> subroutine that processes the data and returns it. What happens if you pass C<$content> by value? You have just copied another 5M and the child has grown in size by B<another> 5M. Watch your swap space! Now multiply it again by a factor of 20, and you have 200M of wasted RAM, which will apparently be reused, but it's a waste!

Whenever you think the variable can grow bigger than a few Kb, pass it by reference!

Once I wrote a script that passed the contents of a little flat file database to a function that processed it by value -- it worked and it was fast, but after a time the database became bigger, so passing it by value became expensive. I had to make the decision whether to buy more memory or to rewrite the code. It's obvious that adding more memory would be merely a temporary solution. So it's better to plan ahead and pass variables by reference, if a variable you are going to pass might eventually become bigger than you envisage at the time you code the program.

There are a few approaches you can use to pass and use variables passed by reference. For example:

  my $content = qq{foobarfoobar};
  process(\$content);
  sub process{
    my $r_var = shift;
    $$r_var =~ s/foo/bar/gs;
    # nothing returned - the variable $content outside has already
    # been modified
  }

If you work with arrays or hashes it's:

  @{$var_lr}  dereferences an array
  %{$var_hr}  dereferences a hash

We can still access individual elements of arrays and hashes that we have a reference to without dereferencing them:

  $var_lr->[$index]  get $index'th element of an array via a ref
  $var_hr->{$key}    get $key'th element of a hash via a ref

For more information see C<perldoc perlref>.

Another approach would be to use the C<@_> array directly. This has the effect of passing by reference:

  process($content);
  sub process{
    $_[0] =~ s/foo/bar/gs;
    # nothing returned - the variable $content outside has been
    # already modified
  }

From C<perldoc perlsub>:

  The array @_ is a local array, but its elements are aliases for
  the actual scalar parameters.
  In particular, if an element $_[0] is updated, the corresponding
  argument is updated (or an error occurs if it is not possible to
  update)...

Be careful when you write this kind of subroutine, since it can confuse a potential user. It's not obvious that a call like C<process($content);> modifies the passed variable. Programmers (the users of your library in this case) are used to subroutines that either modify variables passed by reference or expressly return a result (e.g. C<$content=process($content);>).

=head3 Work With Databases

If you do some DB processing, you will often encounter the need to read lots of records into your program, and then print them to the browser after they are formatted. I won't even mention the horrible case where programmers read in the whole DB and then use Perl to process it!!! Use a relational DB and let the SQL do the job, so you get only the records you need!

We will use C<DBI> for this (assume that we are already connected to the DB--refer to C<perldoc DBI> for a complete reference to the C<DBI> module):

  $sth->execute;
  while(@row_ary = $sth->fetchrow_array) {
    # do DB accumulation into some variable
  }
  # print the output using the data returned from the DB

In the example above the httpd process will grow by the size of the variables that have been allocated for the records that matched the query. Again, remember to multiply it by the number of children your server runs!

A better approach is not to accumulate the records, but rather to print them as they are fetched from the DB. Moreover, we will use the C<bind_columns()> and C<$sth-E<gt>fetchrow_arrayref()> (aliased to C<$sth-E<gt>fetch()>) methods, to fetch the data in the fastest possible way. The example below prints an HTML table with the matched data; the only memory that is being used is a C<@cols> array to hold temporary row values. (Note, though, that the client browser will render the table only after the whole of it has been received.)

  my @select_fields = qw(a b c);
      # create a list of cols values
  my @cols = ();
  @cols[0..$#select_fields] = ();
  $sth = $dbh->prepare($do_sql);
  $sth->execute;
    # Bind perl variables to columns.
  $sth->bind_columns(undef,\(@cols));
  print "<TABLE>";
  while($sth->fetch) {
    print "<TR>",
          map("<TD>$_</TD>", @cols),
          "</TR>";
  }
  print "</TABLE>";

Note: the above method doesn't allow you to know how many records have been matched. The workaround is to run an identical query before the code above, in which you use C<SELECT count(*) ...> instead of C<SELECT * ...>, to get the number of matched records. It should be much faster, since you can remove any C<ORDER BY> and similar clauses.

For those who think that B<$sth-E<gt>rows> will do the job, here is the quote from the C<DBI> manpage:

  rows();

  $rv = $sth->rows;

  Returns the number of rows affected by the last database altering
  command, or -1 if not known or not available. Generally you can
  only rely on a row count after a do or non-select execute (for
  some specific operations like update and delete) or after fetching
  all the rows of a select statement.

  For select statements it is generally not possible to know how
  many rows will be returned except by fetching them all. Some
  drivers will return the number of rows the application has fetched
  so far but others may return -1 until all rows have been fetched.
  So use of the rows method with select statements is not
  recommended.

As a bonus, I wanted to write a single sub that flexibly processes any query. It would accept conditions, a callback closure sub, select fields and restrictions.
  # Usage:
  # $o->dump(\%conditions, \&callback_closure, \@select_fields, @restrictions);
  #
  sub dump{
    my $self = shift;

    my %param = %{+shift}; # dereference hash
    my $rsub = shift;
    my @select_fields = @{+shift}; # dereference list
    my @restrict = shift || '';

      # create a list of cols values
    my @cols = ();
    @cols[0..$#select_fields] = ();

    my $do_sql = '';
    my @where = ();

      # make a @where list
    map { push @where, "$_=\'$param{$_}\'" if $param{$_};} keys %param;

      # prepare the sql statement
    $do_sql = "SELECT ";
    $do_sql .= join(" ", @restrict) if @restrict; # append restriction list
    $do_sql .= " " .join(",", @select_fields) ;   # append select list
    $do_sql .= " FROM $DBConfig{TABLE} ";         # from table

      # we will not add the WHERE clause if @where is empty
    $do_sql .= " WHERE " . join " AND ", @where if @where;

    print "SQL: $do_sql \n" if $debug;

    $dbh->{RaiseError} = 1; # do this, or check every call for errors
    $sth = $dbh->prepare($do_sql);
    $sth->execute;
      # Bind perl variables to columns.
    $sth->bind_columns(undef,\(@cols));
    while($sth->fetch) {
      &$rsub(@cols);
    }
      # print the tail or "no records found" message
      # according to the previous calls
    &$rsub();

  } # end of sub dump

Now a callback closure sub can do lots of things. We need a closure to know what stage we are in: header, body or tail. For example, we want a callback closure for formatting the rows to print:

  my $rsub = eval {
      # make a copy of @fields list, since it might go
      # out of scope when this closure is called
    my @fields = @fields;
    my @query_fields = qw(user dir tool act);   # no date field!!!
    my $header = 0;
    my $tail   = 0;
    my $counter = 0;
    my %cols = (); # columns name=> value hash

      # Closure with the following behavior:
      # 1. Header's code will be executed on the first call only and
      #    if @_ was set
      # 2. Row's printing code will be executed on every call with @_ set
      # 3. Tail's code will be executed only if Header's code was
      #    printed and @_ isn't set
      # 4. "No record found" code will be executed if Header's code
      #    wasn't executed

    sub {
        # Header
      if (@_ and !$header){
        print "<TABLE>\n";
        print $q->Tr(map{ $q->td($_) } @fields );
        $header = 1;
      }

        # Body
      if (@_) {
        print $q->Tr(map{$q->td($_)} @_ );
        $counter++;
        return;
      }

        # Tail, will be printed only at the end
      if ($header and !($tail or @_)){
        print "</TABLE>\n $counter records found";
        $tail = 1;
        return;
      }

        # No record found
      unless ($header){
        print $q->p($q->center($q->b("No record was found!\n")));
      }

    } # end of sub {}
  };  # end of my $rsub = eval {

You might also want to check the sections L<Limiting the Size of the Processes|performance/Limiting_the_Size_of_the_Process> and L<Limiting Other Resources Used by Apache Child Processes|performance/Limiting_Other_Resources_Used_by>.

=head2 Keeping the Shared Memory Limit

As we have discussed already, during the child process' life some of its memory pages become unshared as data structures are modified and become I<"dirty">, leading to increased real memory consumption. To keep a process from growing, it should be killed; the newly started process will again have all of its memory shared with the parent process. While the new process serves requests the unsharing repeats, and eventually it too has to be replaced.

As you remember, the C<MaxRequestsPerChild> directive allows you to specify the number of requests a server process should handle before it gets killed. So you have to tune this directive by finding the optimal value, at which the process doesn't accumulate too much unshared memory; a configuration sketch follows.
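For example, a minimal I<httpd.conf> sketch. The value C<300> here is only an illustrative guess, not a recommendation; find your own number by watching how quickly your children lose shared memory, e.g. with top(1) or C<GTop>:

  # httpd.conf
  # restart each child after it has served 300 requests, before too
  # many of its memory pages have become unshared
  MaxRequestsPerChild 300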
But this is a very inconvenient solution, since chances are that your service is undergoing constant changes, and you will have to re-tune this number again and again to adapt to the ever changing code base.

It would be much nicer if we could just set some guardian to watch the actual shared memory usage and kill the process only when it goes below a specified limit; then a process whose limit is never crossed might never be killed at all. That's where the C<Apache::GTopLimit> module comes to help. If you are lucky enough to have your OS among those that can build the L<libgtop|download/libgtop> library, you will be able to build the C<GTop> module that provides the Perl API for C<libgtop>, which is in turn used by C<Apache::GTopLimit> (that's the I<GTop> part in the name).

To set a shared memory lower limit of 4MB using C<Apache::GTopLimit>, add the following code into the I<startup.pl> file:

  use Apache::GTopLimit;
  $Apache::GTopLimit::MIN_PROCESS_SHARED_SIZE = 4096;

and in I<httpd.conf>:

  PerlFixupHandler Apache::GTopLimit

and don't forget to restart the server for the changes to take effect.

You may not want to set this limit by default, but only for those requests that are likely to get the memory unshared; in this case the memory size test will be performed only when you ask for it. You register the post-processing check by using the set_min_shared_size() function. For example:

  use Apache::GTopLimit;
  if ($need_to_limit){
      Apache::GTopLimit->set_min_shared_size(4096);
  }

Since accessing the process info might add a little overhead, you may want to check the process size only every N requests. That's where the C<$Apache::GTopLimit::CHECK_EVERY_N_REQUESTS> variable comes to help. For example, to test the size on every other request, put in your I<startup.pl>:

  $Apache::GTopLimit::CHECK_EVERY_N_REQUESTS = 2;

If you want to run this module in debug mode, add the following directive to your I<startup.pl>:

  $Apache::GTopLimit::DEBUG = 1;

=head2 Limiting the Size of the Processes

So now you know how to prevent processes from consuming more real memory when their memory gets unshared. An even more important restriction that we want to impose is on the absolute size of the process. If the process grows after each request--especially if your code has memory leaks, or if you are unfortunate enough to run an OS whose C libraries leak memory--you can easily run out of memory if nothing restricts those processes from growing.

The only restriction we can impose is to kill the processes when they become too big. You can set the C<MaxRequestsPerChild> directive to kill the processes after only a few requests have been served, but as we have explained in the previous section this solution is not as good as the ability to monitor the process size and kill it only when the limit is crossed.

If you have the C<Apache::GTopLimit> module we have described in the previous section, you can control the upper limit by setting the C<$Apache::GTopLimit::MAX_PROCESS_SIZE> variable.
For example, if you want the processes to be killed when they grow bigger than 10MB, you should set the following limit in the I<startup.pl> file:

  $Apache::GTopLimit::MAX_PROCESS_SIZE = 10240;

Just like with the shared memory limiting, you can set the limit for the current process using the set_max_size() method in your code:

  use Apache::GTopLimit;
  Apache::GTopLimit->set_max_size(10000);

Another alternative is to use the C<Apache::SizeLimit> module, which at the moment of this writing is available for more platforms than C<Apache::GTopLimit>. You should check the module's manpage to find out what they are. Its usage is very similar to C<Apache::GTopLimit>'s: you control the upper size limit by setting the C<$Apache::SizeLimit::MAX_PROCESS_SIZE> variable in your I<startup.pl> file:

  use Apache::SizeLimit;
  $Apache::SizeLimit::MAX_PROCESS_SIZE = 10240;

And in your I<httpd.conf> you should add:

  PerlFixupHandler Apache::SizeLimit

Just like with C<Apache::GTopLimit>, you can test the memory only every few requests, by setting the C<$Apache::SizeLimit::CHECK_EVERY_N_REQUESTS> variable. For example, on every fourth request:

  $Apache::SizeLimit::CHECK_EVERY_N_REQUESTS = 4;

And you can set the limit from within your code, rather than from the global configuration:

  use Apache::SizeLimit;
  Apache::SizeLimit->setmax(10240);

=head2 Limiting Other Resources Used by Apache Child Processes

In addition to limiting the absolute and shared memory sizes, you might need to prevent the processes from excessive consumption of other system resources, like CPU usage, the number of files that can be opened, or memory segment usage.

The C<Apache::Resource> module allows all this by deploying the C<BSD::Resource> module, which in turn uses the C function C<setrlimit()> to set limits on system resources.

A resource limit is specified as a soft limit and a hard limit. When a soft limit is exceeded a process may receive a signal (for example, if the CPU time or file size is exceeded), but it will be allowed to continue execution until it reaches the hard limit (or modifies its resource limit). The rlimit structure is used to specify the hard and soft limits on a resource. (See the manpage for I<setrlimit> for your OS specific information.)

If the value of the variable is of the form C<S:H>, C<S> is treated as the soft limit and C<H> as the hard limit. If it is just a single number, it is used for both the soft and the hard limit. So if you set C<10:20>, the soft limit is 10 and the hard limit is 20; if you set just C<10>, both the soft and the hard limits are set to 10.

The most common use of this module is to limit CPU usage. The environment variable C<PERL_RLIMIT_CPU> defines the maximum amount of CPU time the process can use. If it runs for longer than this, it gets killed, no matter what it is doing, whether processing a new request or just waiting. This is very useful when you have code with a bug and the process starts to spin in an infinite loop or the like, using a lot of CPU and never completing the request.

META: verify this.

The value is measured in seconds. The following example sets the soft limit of the CPU usage to 120 seconds (the default is 360):

  PerlModule Apache::Resource
  PerlSetEnv PERL_RLIMIT_CPU 120

Of course you should tell mod_perl to use this module, which is done by adding the following directive to I<httpd.conf>:

  PerlChildInitHandler Apache::Resource
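If you want the process to get a warning before it is actually killed, give the soft and hard limits different values with the C<S:H> form described above: on most Unix flavors a process that crosses the soft CPU limit receives a C<SIGXCPU> signal and is killed outright only at the hard limit (check setrlimit(2) on your OS). A sketch with purely illustrative numbers:

  PerlModule Apache::Resource
  # soft limit of 120 CPU seconds, hard limit of 180
  PerlSetEnv PERL_RLIMIT_CPU 120:180
  PerlChildInitHandler Apache::Resource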
There are other resources that you might want to limit. For example you can limit the memory data and stack segment sizes (C<PERL_RLIMIT_DATA> and C<PERL_RLIMIT_STACK>), the maximum process file size (C<PERL_RLIMIT_FSIZE>), the core file size (C<PERL_RLIMIT_CORE>), the address space (virtual memory) limit (C<PERL_RLIMIT_AS>), etc. Refer to the setrlimit(2) man page on your OS for other possible resources. Remember to prepend C<PERL_> to the resource types you see in the man page.

If you configure C<Apache::Status>, it will let you review the resources set in this way. Remember that C<Apache::Status> must be loaded before C<Apache::Resource>, in order to enable the resources display menu.

If you want to turn on debug mode, set the C<$Apache::Resource::Debug> variable before loading the module, for example by using a Perl section in I<httpd.conf>:

  <Perl>
    $Apache::Resource::Debug = 1;
    require Apache::Resource;
  </Perl>
  PerlChildInitHandler Apache::Resource

Now watch the I<error_log> file using tail(1) and see the debug messages showing up as requests are served.

=head3 OS Specific notes

Note that under Linux malloc() uses mmap() instead of brk(). This is done to conserve virtual memory--that is, when you malloc a large block of memory, it isn't actually given to your program until you initialize it. The old-style brk() system call obeyed resource limits on data segment size as set in setrlimit(); mmap() doesn't.

C<Apache::Resource>'s defaults put caps on data size and stack size. Linux's current memory allocation scheme doesn't honor these limits, so if you just do

  PerlSetEnv PERL_RLIMIT_DEFAULTS On
  PerlModule Apache::Resource
  PerlChildInitHandler Apache::Resource

your Apache processes are still free to use as much memory as they like.

However, C<BSD::Resource> also has a limit called C<RLIMIT_AS> (Address Space) which limits the total number of bytes of virtual memory assigned to a process. Happily, Linux's memory manager I<does> honor this limit.

Therefore, you I<can> limit memory usage under Linux with C<Apache::Resource>--simply add a line to I<httpd.conf>:

  PerlSetEnv PERL_RLIMIT_AS  67108864

This example sets a hard and soft limit of 64MB of total address space. Refer to the C<Apache::Resource> and C<setrlimit(2)> manpages for more information.

=head2 Limiting the Number of Processes Serving the Same Resource

If you want to limit the number of Apache children that can simultaneously serve the (nearly) same resource, you should take a look at the L<C<mod_throttle_access>|download/mod_throttle_access> module.

It solves the problem of too many concurrent requests accessing the same URI: for example when the handler that serves this URI uses some resource that has a limit on the maximum number of possible users, or when the handler's code is very CPU intensive and you cannot afford more than a certain number of concurrent requests to this specific URI.

Imagine that your service provides the following three URIs:

  /perl/news/
  /perl/webmail/
  /perl/morphing/

The first two URIs are response critical, as people want to read the news and their email. The third URI is a very CPU- and RAM-intensive image morphing service, provided as a bonus to your users. Since you don't want users to abuse this service, you have to set some limit on the number of concurrent requests for this resource; if you don't, the other two critical resources can be hurt.

When you compile in and enable the Apache mod_throttle_access module, the C<MaxConcurrentReqs> directive becomes available.
For example, the following setting:

  <Location "/perl/morphing">
    <Limit PUT GET POST>
      MaxConcurrentReqs 10
    </Limit>
  </Location>

will allow only 10 concurrent PUT, GET or POST requests under the URI I</perl/morphing> to be processed at one time. The other two URIs remain unlimited.

=head2 Limiting the Request Rate Speed (Robot Blocking)

A limitation of using pattern matching to identify robots is that it only catches the robots that you know about, and then only those that identify themselves by name. A few devious robots masquerade as users by using user agent strings that identify themselves as conventional browsers. To catch such robots, you'll have to be more sophisticated.

C<Apache::SpeedLimit> comes to your aid; see:

  http://www.modperl.com/chapters/ch6.html#Blocking_Greedy_Clients

=head1 Perl Modules for Performance Improvement

These sections are about Perl modules that improve performance without requiring changes to your code. Mostly you just need to tweak the configuration file to plug these modules in.

=head2 Sending Plain HTML as Compressed Output

See L<Apache::GzipChain - compress HTML (or anything) in the OutputChain|modules/Apache_GzipChain_compress_HTM>

=head2 Caching Components with HTML::Mason

META: complete the full description

C<HTML::Mason> is a system that makes use of components to build HTML pages.

If most of your output is generated dynamically, but each finished page can be separated into different components, C<HTML::Mason> can cache those components. This can really improve the performance of your service and reduce the load on the system.

Say for example that you have a page consisting of five components, each generated by a different SQL query. For four of the five components the queries are the same for every user, so you don't have to rerun them again and again. Only one component is generated by a unique query and will not use the cache.

META: HTML::Mason docs (v 8.0) said Mason was 2-3 times slower than pure mod_perl, implying that the power & convenience made up for this.

META: Should also mention Embperl (especially since it's C + XS)

=head1 Efficient Work with Databases under mod_perl

Most mod_perl enabled servers work with database engines, so in this section we will learn about two things: how mod_perl makes working with databases faster, and a few tips for more efficient DBI coding in Perl. (DBI provides an identical Perl interface to many database implementations.)

=head2 Persistent DB Connections

Another popular use of mod_perl is to take advantage of its ability to maintain persistent open database connections. You want to have a persistent database connection because the most expensive part of a network transaction for most databases is the business of building and tearing down connections.

Of course the persistence doesn't help with the latency problems during the actual use of the database connections. Oracle is notoriously latency-sensitive: in most cases it generates a network transaction per row returned, which slows things down if the query matches many rows. You may want to read Tim Bunce's Advanced DBI talk at http://www.carumba.com/talk/perl/DBI_Talk3_1999/ , which covers a lot of techniques to reduce latency.
So here is the basic approach to making the connection persistent:

  # Apache::Registry script
  -------------------------
  use strict;
  use vars qw($dbh);

  $dbh ||= SomeDbPackage->connect(...);

Since C<$dbh> is a global variable for the child, once the child has opened the connection it will use it over and over again, unless you perform C<disconnect()>.

Be careful to use different names for the handles if you open connections to different databases!

C<Apache::DBI> allows you to make a persistent database connection. With this module enabled, every C<connect()> request to the plain C<DBI> module will be forwarded to the C<Apache::DBI> module. This looks to see whether a database handle from a previous C<connect()> request has already been opened, and whether this handle is still valid, using the ping method. If these two conditions are fulfilled it just returns the database handle. If there is no appropriate database handle, or if the ping method fails, a new connection is established and the handle is stored for later re-use. B<There is no need to delete the C<disconnect()> statements from your code>. They will not do anything, because the C<Apache::DBI> module overloads the C<disconnect()> method with a NOP. When a child exits there is no explicit disconnect; the child dies and so does the database connection. You may leave the C<use DBI;> statement inside the scripts as well.

The usage is simple -- add to I<httpd.conf>:

  PerlModule Apache::DBI

It is important to load this module before any other C<DBI>, C<DBD::*> and C<ApacheDBI*> modules!

  db.pl
  ------------
  use DBI ();
  use strict;

  my $dbh = DBI->connect( 'DBI:mysql:database', 'user', 'password',
                          { AutoCommit => 0 } ) || die $DBI::errstr;

  ...rest of the program

(Note that the attribute name is C<AutoCommit>; DBI attribute names are case-sensitive, so the all-lowercase C<autocommit> would not work.)

=head3 Preopening Connections at the Child Process' Fork Time

If you use C<DBI> for DB connections, and you use C<Apache::DBI> to make them persistent, it also allows you to preopen connections to the DB for each child with the C<connect_on_init()> method, thus saving the connection overhead on the very first request of every child.

  use Apache::DBI ();
  Apache::DBI->connect_on_init("DBI:mysql:test",
                   "login",
                   "passwd",
                   {
                    RaiseError => 1,
                    PrintError => 0,
                    AutoCommit => 1,
                   }
                  );

This is a simple way to have Apache children establish connections on server startup. This call should be in a startup file C<require()d> by C<PerlRequire> or inside a E<lt>PerlE<gt> section. It will establish a connection when a child is started in that child process. See the C<Apache::DBI> manpage for the requirements for this method.

=head3 Caching prepare() Statements

You can also benefit from persistent connections by replacing prepare() with prepare_cached(). That way you will always be sure that you have a good statement handle, and you will get some caching benefit. The downside is that you are going to pay for DBI to parse your SQL and do a cache lookup every time you call prepare_cached().

Be warned that some databases (e.g. PostgreSQL and Sybase) don't support caches of prepared plans. With Sybase you could open multiple connections to achieve the same result, although this is at the risk of getting deadlocks depending on what you are trying to do!

=head2 Improving mod_perl Database Performance

=head3 Analysis of the Problem

A common web application architecture is one or more application servers which handle requests from client browsers by consulting one or more database servers and performing a transform on the data.
When an application must consult the database on every request, the interaction with the database server becomes the central performance issue. Spending a bit of time optimizing your database access can result in significant application performance improvements. In this analysis, a system using Apache, mod_perl, C<DBI>, and Oracle will be considered. The application server uses Apache and mod_perl to service client requests, and C<DBI> to communicate with a remote Oracle database. In the course of servicing a typical client request, the application server must retrieve some data from the database and execute a stored procedure. There are several steps that need to be performed to complete the request: 1: Connect to the database server 2: Prepare a SQL SELECT statement 3: Execute the SELECT statement 4: Retrieve the results of the SELECT statement 5: Release the SELECT statement handle 6: Prepare a PL/SQL stored procedure call 7: Execute the stored procedure 8: Release the stored procedure statement handle 9: Commit or rollback 10: Disconnect from the database server In this document, an application will be described which achieves maximum performance by eliminating some of the steps above and optimizing others. =head3 Optimizing Database Connections A naive implementation would perform steps 1 through 10 from above on every request. A portion of the source code might look like this: # ... my $dbh = DBI->connect('dbi:Oracle:host', 'user', 'pass') || die $DBI::errstr; my $baz = $r->param('baz'); eval { my $sth = $dbh->prepare(qq{ SELECT foo FROM bar WHERE baz = $baz }); $sth->execute; while (my @row = $sth->fetchrow_array) { # do HTML stuff } $sth->finish; my $sph = $dbh->prepare(qq{ BEGIN my_procedure( arg_in => $baz ); END; }); $sph->execute; $sph->finish; $dbh->commit; }; if ($@) { $dbh->rollback; } $dbh->disconnect; # ... In practice, such an implementation would have hideous performance problems. The majority of the execution time of this program would likely be spent connecting to the database. An examination shows that step 1 is comprised of many smaller steps: 1: Connect to the database server 1a: Build client-side data structures for an Oracle connection 1b: Look up the server's alias in a file 1c: Look up the server's hostname 1d: Build a socket to the server 1e: Build server-side data structures for this connection The naive implementation waits for all of these steps to happen, and then throws away the database connection when it is done! This is obviously wasteful, and easily rectified. The best solution is to hoist the database connection step out of the per-request lifecycle so that more than one request can use the same database connection. This can be done by connecting to the database server once, and then not disconnecting until the Apache child process exits. The C<Apache::DBI> module does this transparently and automatically with little effort on the part of the programmer. C<Apache::DBI> intercepts calls to C<DBI>'s connect and disconnect methods and replaces them with its own. C<Apache::DBI> caches database connections when they are first opened, and it ignores disconnect commands. When an application tries to connect to the same database, C<Apache::DBI> returns a cached connection, thus saving the significant time penalty of repeatedly connecting to the database. You will find a full treatment of C<Apache::DBI> at L<Persistent DB Connections|performance/Persistent_DB_Connections> When C<Apache::DBI> is in use, none of the code in the example needs to change. 
The code is upgraded from naive to respectable with the use of a simple module! The first and biggest database performance problem is quickly dispensed with. =head3 Utilizing the Database Server's Cache Most database servers, including Oracle, utilize a cache to improve the performance of recently seen queries. The cache is keyed on the SQL statement. If a statement is identical to a previously seen statement, the execution plan for the previous statement is reused. This can be a considerable improvement over building a new statement execution plan. Our respectable implementation from the last section is not making use of this caching ability. It is preparing the statement: SELECT foo FROM bar WHERE baz = $baz The problem is that C<$baz> is being read from an HTML form, and is therefore likely to change on every request. When the database server sees this statement, it is going to look like: SELECT foo FROM bar WHERE baz = 1 and on the next request, the SQL will be: SELECT foo FROM bar WHERE baz = 42 Since the statements are different, the database server will not be able to reuse its execution plan, and will proceed to make another one. This defeats the purpose of the SQL statement cache. The application server needs to make sure that SQL statements which are the same look the same. The way to achieve this is to use placeholders and bound parameters. The placeholder is a blank in the SQL statement, which tells the database server that the value will be filled in later. The bound parameter is the value which is inserted into the blank before the statement is executed. With placeholders, the SQL statement looks like: SELECT foo FROM bar WHERE baz = :baz Regardless of whether C<baz> is 1 or 42, the SQL always looks the same, and the database server can reuse its cached execution plan for this statement. This technique has eliminated the execution plan generation penalty from the per-request runtime. The potential performance improvement from this optimization could range from modest to very significant. Here is the updated code fragment which employs this optimization: # ... my $dbh = DBI->connect('dbi:Oracle:host', 'user', 'pass') || die $DBI::errstr; my $baz = $r->param('baz'); eval { my $sth = $dbh->prepare(qq{ SELECT foo FROM bar WHERE baz = :baz }); $sth->bind_param(':baz', $baz); $sth->execute; while (my @row = $sth->fetchrow_array) { # do HTML stuff } $sth->finish; my $sph = $dbh->prepare(qq{ BEGIN my_procedure( arg_in => :baz ); END; }); $sph->bind_param(':baz', $baz); $sph->execute; $sph->finish; $dbh->commit; }; if ($@) { $dbh->rollback; } # ... =head3 Eliminating SQL Statement Parsing The example program has certainly come a long way and the performance is now probably much better than that of the first revision. However, there is still more speed that can be wrung out of this server architecture. The last bottleneck is in SQL statement parsing. Every time C<DBI>'s prepare() method is called, C<DBI> parses the SQL command looking for placeholder strings, and does some housekeeping work. Worse, a context has to be built on the client and server sides of the connection which the database will use to refer to the statement. These things take time, and by eliminating these steps the time can be saved. To get rid of the statement handle construction and statement parsing penalties, we could use C<DBI>'s prepare_cached() method. This method compares the SQL statement to others that have already been executed. If there is a match, the cached statement handle is returned. 
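To make the change concrete, here is a minimal sketch of the example's SELECT rewritten with prepare_cached(); the table, placeholder and variable names are carried over from the example above, and everything else stays as before:

  # instead of $dbh->prepare(...), which builds a new handle each time:
  my $sth = $dbh->prepare_cached(qq{
        SELECT foo
          FROM bar
         WHERE baz = :baz
  });
  $sth->bind_param(':baz', $baz);
  $sth->execute;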
But the application server is still spending time calling an object method (very expensive in Perl), and doing a hash lookup. Both of these steps are unnecessary, since the SQL is very likely to be static and known at compile time. The smart programmer can take advantage of these two attributes to gain better database performance. In this example, the database statements will be prepared immediately after the connection to the database is made, and they will be cached in package scalars to eliminate the method call. What is needed is a routine that will connect to the database and prepare the statements. Since the statements are dependent upon the connection, the integrity of the connection needs to be checked before using the statements, and a reconnection should be attempted if needed. Since the routine presented here does everything that C<Apache::DBI> does, it does not use C<Apache::DBI> and therefore has the added benefit of eliminating a cache lookup on the connection. Here is an example of such a package: package My::DB; use strict; use DBI (); sub connect { if (defined $My::DB::conn) { eval { $My::DB::conn->ping; }; if (!$@) { return $My::DB::conn; } } $My::DB::conn = DBI->connect( 'dbi:Oracle:server', 'user', 'pass', { PrintError => 1, RaiseError => 1, AutoCommit => 0 } ) || die $DBI::errstr; #Assume application handles this $My::DB::select = $My::DB::conn->prepare(q{ SELECT foo FROM bar WHERE baz = :baz }); $My::DB::procedure = $My::DB::conn->prepare(q{ BEGIN my_procedure( arg_in => :baz ); END; }); return $My::DB::conn; } 1; Now the example program needs to be modified to use this package. # ... my $dbh = My::DB->connect; my $baz = $r->param('baz'); eval { my $sth = $My::DB::select; $sth->bind_param(':baz', $baz); $sth->execute; while (my @row = $sth->fetchrow_array) { # do HTML stuff } my $sph = $My::DB::procedure; $sph->bind_param(':baz', $baz); $sph->execute; $dbh->commit; }; if ($@) { $dbh->rollback; } # ... Notice that several improvements have been made. Since the statement handles have a longer life than the request, there is no need for each request to prepare the statement, and no need to call the statement handle's finish method. Since C<Apache::DBI> and the prepare_cached() method are not used, no cache lookups are needed. =head3 Conclusion The number of steps needed to service the request in the example system has been reduced significantly. In addition, the hidden cost of building and tearing down statement handles and of creating query execution plans is removed. Compare the new sequence with the original: 1: Check connection to database 2: Bind parameter to SQL SELECT statement 3: Execute SELECT statement 4: Fetch rows 5: Bind parameters to PL/SQL stored procedure 6: Execute PL/SQL stored procedure 7: Commit or rollback It is probably possible to optimize this example even further, but I have not tried. It is very likely that the time could be better spent improving your database indexing scheme or web server buffering and load balancing. =head1 Using 3rd Party Applications It's been said that no one can do everything well, but one can do something specific extremely well. This seems to be true for many software applications, when you don't try to do everything but instead concentrate on something specific you can do it really well. 
Based on the above introduction: while the mod_perl server can do many, many things, there are other applications (or Apache server modules) that can do some specific operations faster, or that can do the mod_perl server a great service by taking some operations upon themselves and thus offloading it. Let's take a look at a few of these.

=head2 Proxying the mod_perl Server

A proxy gives you a great performance increase in most cases. It's discussed in the section L<Adding a Proxy Server in http Accelerator Mode|strategy/Adding_a_Proxy_Server_in_http_Ac>.

=head1 Upload and Download of Big Files

You don't want to tie up your precious mod_perl backend server children doing something as long and simple as transferring a file, especially a big one. The overhead saved by mod_perl is typically under one second, which is an enormous saving for scripts whose run time is also under one second; but when the transfer takes up to several minutes, the user won't really see any important performance benefit from mod_perl.

If some particular script's main functionality is the uploading or downloading of big files, you probably want it to be executed on a plain Apache server under mod_cgi (i.e. performing this operation on the front-end server, if you use L<a dual-server setup|scenario/One_Plain_and_One_mod_perl_enabl>). This of course assumes that the script requires none of the functionality of the mod_perl server, such as custom authentication handlers.

=head1 Apache/mod_perl Build Options

It's important how you build your mod_perl enabled Apache: it influences the size of the httpd executable, and some irrelevant compiled-in modules might slow the performance.

[ReaderMETA: Any other building time things that influence performance?]

=head2 mod_perl Process Size as a Function of Compiled in C Modules and mod_perl Features

You might wonder whether it's better to compile in only the required modules and mod_perl hooks, or whether it doesn't really matter. To answer this question, let's first make a few compilations and compare the results.

So we are going to build mod_perl starting with:

  % perl Makefile.PL APACHE_SRC=../apache_x.x.x/src \
    DO_HTTPD=1 USE_APACI=1

followed by one of these option groups:

=over

=item 1

Default: I<no arguments>

=item 2

Minimum:

  APACI_ARGS='--disable-module=env, \
              --disable-module=negotiation, \
              --disable-module=status, \
              --disable-module=info, \
              --disable-module=include, \
              --disable-module=autoindex, \
              --disable-module=dir, \
              --disable-module=cgi, \
              --disable-module=asis, \
              --disable-module=imap, \
              --disable-module=userdir, \
              --disable-module=access, \
              --disable-module=auth'

=item 3

Everything:

  EVERYTHING=1

=item 4

Everything + Debug:

  EVERYTHING=1 PERL_DEBUG=1

=back

After re-compiling with the arguments of each of these groups, we can summarize the results:

  Build group       httpd size (bytes)  Difference
  ---------------------------------------------
  Minimum              892928            +      0
  Default              994316            + 101388
  Everything          1044432            + 151504
  Everything+Debug    1162100            + 269172

Indeed, when you strip most of the default things, the server size is slimmer. But the savings are insignificant, since you don't multiply the added size by the number of child processes if your OS supports memory sharing. The parent process is a little bigger, but it shares these memory pages with its child processes. Of course not everything will be shared: if some module you add modifies the process memory in a way particular to each process, those pages won't be shared, but most of them will. And of course this was just an example to show the difference in size.
It doesn't mean that you can throw everything away, since there will be Apache modules and mod_perl options that you won't be able to work without. But as a good system administrator's rule says: I<"Run the absolute minimum of the applications. If you don't know or need something, disable it">. Following this rule to decide on the required Apache components, and disabling the unneeded default components, makes you a good Apache administrator.

=head1 Perl Build Options

The Perl interpreter is the brain of the mod_perl server, and if we can optimize perl into doing things faster under mod_perl, we make the whole server faster. Generally, optimizing the Perl interpreter means enabling or disabling some build-time options. Let's see a few important ones.

=head2 -DTWO_POT_OPTIMIZE and -DPACK_MALLOC Perl Build Options

Newer Perl versions also have build time options to reduce runtime memory consumption. These options might shrink the size of your httpd by about 150k--quite a big number if you remember to multiply it by the number of children you use.

The C<-DTWO_POT_OPTIMIZE> macro improves allocations of data with size close to a power of two, but this works for big allocations (starting with 16K by default). Such allocations are typical for big hashes and special-purpose scripts, especially image processing.

Perl memory allocation is by buckets with sizes close to powers of two. Because of this, the malloc() overhead may be big, especially for data of size exactly a power of two. If C<PACK_MALLOC> is defined, perl uses a slightly different algorithm for small allocations (up to 64 bytes long), which makes it possible to have overhead down to 1 byte for allocations which are powers of two (and appear quite often).

Expected memory savings (with 8-byte alignment in C<alignbytes>) are about 20% for typical Perl usage. The expected slowdown due to the additional malloc() overhead is in fractions of a percent, and hard to measure because of the effect of the saved memory on speed. You will find these and other memory improvement details in C<perl5004delta.pod>.

Important: both options are On by default in perl versions 5.005 and higher.

=head2 -Dusemymalloc Perl Build Option

You have a choice between the native malloc() implementation and Perl's own. The choice depends on your Operating System; unless you know which of the two is better on yours, you had better try both and compare the benchmarks.

To build without Perl's malloc(), you can use the Configure command:

  % sh Configure -Uusemymalloc

Note that:

  -U == undefine usemymalloc (use system malloc)
  -D == define usemymalloc (use Perl's malloc)

It seems that Linux still defaults to system malloc, so you might want to configure Perl with C<-Dusemymalloc>. Perl's malloc is not much of a win under Linux, but makes a B<huge> difference under Solaris.

=head1 Architecture Specific Compile Options

When you build Apache and Perl you can optimize the compiled applications to take advantage of your machine's architecture. Everything depends on the kind of compiler that you use and the kind of CPU you have. For example, if you use gcc(1) you might want to use:

=over

=item *

I<-march=pentium> if you have a pentium CPU

=item *

I<-march=pentiumpro> for pentiumpro and above (but the binary won't run on i386)

=item *

I<-fomit-frame-pointer> makes an extra register available but disables debugging

=item *

you can also try these options, which were reported to improve performance: I<-ffast-math>, I<-malign-double>, I<-funroll-all-loops>, I<-fno-rtti>, I<-fno-exceptions>.
see the gcc(1) manpage for the details about these options

=item *

and of course you may want to change the usual default C<-O2> flag to a higher level like I<-O3>. I<-OX> (where X is a number between 1 and 6) defines a collection of various optimization flags; the higher the number, the more flags are bundled. The gcc man page will tell you what flags are used for each number.

=back

Test your applications thoroughly when you change the default optimization flags, especially when you go beyond C<-O2>. It's possible that the optimization will make the code work incorrectly and/or cause segmentation faults. See your preferred compiler's man page for detailed information about optimization.

=cut

  1.1                  modperl-docs/src/docs/1.0/guide/perl.pod

  Index: perl.pod
  ===================================================================

=head1 NAME

Perl Reference

=head1 A Must Read!

This document was born because some users are reluctant to learn Perl prior to jumping into mod_perl. I will try to cover some of the most frequent pure Perl questions being asked on the list.

Before you decide to skip this chapter make sure you know all the information provided here. The rest of the Guide assumes that you have read this chapter and understood it.

=head1 perldoc's Rarely Known But Very Useful Options

First of all, I want to stress that you cannot become a Perl hacker without knowing how to read the Perl documentation and search through it. Books are good, but an easily accessible and searchable Perl reference at your fingertips is a great time saver. It always has up-to-date information for the version of perl you're using.

Of course you can use the online Perl documentation on the Web. The two major sites are http://www.perldoc.com and http://theoryx5.uwinnipeg.ca/CPAN/perl/.

The C<perldoc> utility provides you with access to the documentation installed on your system. To find out what Perl manpages are available, execute:

  % perldoc perl

To find out which functions perl has, execute:

  % perldoc perlfunc

To learn the syntax and to find examples of a specific function, you would execute (e.g. for C<open()>):

  % perldoc -f open

Note: In perl5.005_03 and earlier, there is a bug in this and the C<-q> options of C<perldoc>. It won't call C<pod2man>, but will display the section in POD format instead. Despite this bug it's still readable and very useful.

The Perl FAQ (I<perlfaq> manpage) is in several sections. To search through the sections for C<open> you would execute:

  % perldoc -q open

This will show you all the matching Question and Answer sections, still in POD format.

To read the I<perldoc> manpage you would execute:

  % perldoc perldoc

=head1 Tracing Warnings Reports

Sometimes it's very hard to understand what a warning is complaining about. You see the source code, but you cannot understand why some specific snippet produces that warning. The mystery often results from the fact that the code can be called from different places if it's located inside a subroutine.

Here is an example:

  warnings.pl
  -----------
  #!/usr/bin/perl -w

  use strict;

  correct();
  incorrect();

  sub correct{
    print_value("Perl");
  }

  sub incorrect{
    print_value();
  }

  sub print_value{
    my $var = shift;
    print "My value is $var\n";
  }

In the code above, print_value() prints the passed value. The subroutine correct() passes the value to print, but in the subroutine incorrect() we forgot to pass it. When we run the script:

  % ./warnings.pl

we get the warning:

  Use of uninitialized value at ./warnings.pl line 16.
Perl complains about an undefined variable C<$var> at the line that attempts to print its value:

  print "My value is $var\n";

But how do we know why it is undefined? The reason here obviously is that the calling function didn't pass the argument. But how do we know who the caller was? In our example there are two possible callers; in the general case there can be many of them, perhaps located in other files.

We can use the caller() function, which tells us who has called us, but even that might not be enough: it's possible to have a longer sequence of called subroutines, and not just two. For example, here it is sub third() which is at fault, and putting a call to caller() in sub second() would not help us very much:

  sub third{
    second();
  }

  sub second{
    my $var = shift;
    first($var);
  }

  sub first{
    my $var = shift;
    print "Var = $var\n"
  }

The solution is quite simple. What we need is a full stack trace of the calls that led to the warning. The C<Carp> module comes to our aid with its cluck() function. Let's modify the script by adding a couple of lines. The rest of the script is unchanged.

  warnings2.pl
  -----------
  #!/usr/bin/perl -w

  use strict;
  use Carp ();

  local $SIG{__WARN__} = \&Carp::cluck;

  correct();
  incorrect();

  sub correct{
    print_value("Perl");
  }

  sub incorrect{
    print_value();
  }

  sub print_value{
    my $var = shift;
    print "My value is $var\n";
  }

Now when we execute it, we see:

  Use of uninitialized value at ./warnings2.pl line 19.
    main::print_value() called at ./warnings2.pl line 14
    main::incorrect() called at ./warnings2.pl line 7

Take a moment to understand the call stack trace. The deepest calls are printed first. So the second line tells us that the warning was triggered in print_value(); the third, that print_value() was called by the subroutine incorrect().

  script => incorrect() => print_value()

We go into C<incorrect()> and indeed see that we forgot to pass the variable. Of course when you write a subroutine like C<print_value> it would be a good idea to check the passed arguments before starting execution. We omitted that step to contrive an easily debugged example.

Sure, you say, I could find that problem by simple inspection of the code! Well, you're right. But I promise you that your task would be quite complicated and time consuming if your code ran to a few thousand lines. In addition, under mod_perl, certain uses of the C<eval> operator and "here documents" are known to throw off Perl's line numbering, so the messages reporting warnings and errors can have incorrect line numbers. (See L<Finding the Line Which Triggered the Error or Warning|debug/Finding_the_Line_Which_Triggered> for more information).

Getting the trace helps a lot.

=head1 Variables Globally, Lexically Scoped And Fully Qualified

META: this material is new and requires polishing so read with care.

You will hear a lot about namespaces, symbol tables and lexical scoping in Perl discussions, but little of it will make any sense without a few key facts:

=head2 Symbols, Symbol Tables and Packages; Typeglobs

There are two important types of symbol: package global and lexical. We will talk about lexical symbols later; for now we will talk only about package global symbols, which we will refer to simply as I<global symbols>.

The names of pieces of your code (subroutine names) and the names of your global variables are symbols. Global symbols reside in one symbol table or another.
The code itself and the data do not; the symbols are the names of pointers which point (indirectly) to the memory areas which contain the code and data. (Note for C/C++ programmers: we use the term `pointer' in the general sense of one piece of data referring to another piece of data, not in the specific sense used in C or C++.)

There is one symbol table for each package (which is why I<global symbols> are really I<package global symbols>).

You are always working in one package or another. Just as in C, where the function that is executed first is main(), the first statement of your first Perl script is in package C<main::>, which is the default package. Unless you say otherwise by using the C<package> statement, your symbols are all in package C<main::>. You should be aware straight away that files and packages are I<not related>. You can have any number of packages in a single file; and a single package can be in one file or spread over many files. However it is very common to have a single package in a single file. To declare a package you write:

  package mypackagename;

From the following line on you are in package C<mypackagename>, and any symbols you declare reside in that package. When you create a symbol (variable, subroutine etc.) Perl uses the name of the package in which you are currently working as a prefix to create the fully qualified name of the symbol.

When you create a symbol, Perl creates a symbol table entry for that symbol in the current package's symbol table (by default C<main::>). Each symbol table entry is called a I<typeglob>. Each typeglob can hold information on a scalar, an array, a hash, a subroutine (code), a filehandle, a directory handle and a format, all of which share the same name. So you see now that there are two indirections for a global variable: the symbol (the thing's name) points to its typeglob, and the typeglob slot for the thing's type (scalar, array, etc.) points to the data. If we had a scalar and an array with the same name, their name would point to the same typeglob, but for each type of data the typeglob points somewhere different, so the scalar's data and the array's data are completely separate and independent; they just happen to have the same name.

Most of the time, only one part of a typeglob is used (yes, it's a bit wasteful). You will by now know that you distinguish between them by using what the authors of the Camel book call a I<funny character>. So if we have a scalar called `C<line>' we would refer to it in code as C<$line>, and if we had an array of the same name, that would be written C<@line>. Both would point to the same typeglob (which would be called C<*line>), but because of the I<funny character> (also known as I<decoration>) perl won't confuse the two. Of course we might confuse ourselves, so some programmers don't ever use the same name for more than one type of variable.

Every global symbol is in some package's symbol table. To refer to a global symbol we could write the I<fully qualified> name, e.g. C<$main::line>. If we are in the same package as the symbol we can omit the package name, e.g. C<$line> (unless you use the C<strict> pragma, in which case you will have to predeclare the variable using the C<vars> pragma). We can also omit the package name if we have imported the symbol into our current package's namespace. If we want to refer to a symbol that is in another package and which we haven't imported, we must use the fully qualified name, e.g. C<$otherpkg::box>.
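To make the typeglob discussion more concrete, here is a little sketch (the script name is made up) showing a scalar and an array that share the name C<line>, and therefore the typeglob C<*line>, while holding completely independent data:

  typeglob.pl
  -----------
  #!/usr/bin/perl -w

  $line = "a scalar";      # $main::line -- fully qualified name
  @line = qw(an array);    # @main::line -- same symbol, separate data

  print "$main::line\n";   # prints: a scalar
  print "@main::line\n";   # prints: an array
  print *line, "\n";       # prints: *main::line -- the shared typeglob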
Most of the time you do not need to use the fully qualified symbol name, because most of the time you will refer to package variables from within the package. This is very like C++ class variables. You can work entirely within package C<main::> and never even know you are using a package, nor that the symbols have package names. In a way, this is a pity, because you may fail to learn about packages, and they are extremely useful.

The exception is when you I<import> the variable from another package. This creates an alias for the variable in the I<current> package, so that you can access it without using the fully qualified name.

Whilst global variables are useful for sharing data and are necessary in some contexts, it is usually wisest to minimize their use and use I<lexical variables>, discussed next, instead.

Note that when you create a variable, the low-level business of allocating memory to store the information is handled automatically by Perl. The interpreter keeps track of the chunks of memory to which the pointers are pointing and takes care of undefining variables. When all references to a variable have ceased to exist, the perl garbage collector is free to take back the memory used, ready for recycling. However perl almost never returns memory it has already used to the operating system during the lifetime of the process.

=head3 Lexical Variables and Symbols

The symbols for lexical variables (i.e. those declared using the keyword C<my>) are the only symbols which do I<not> live in a symbol table. Because of this, they are not available from outside the block in which they are declared. There is no typeglob associated with a lexical variable, and a lexical variable can refer only to a scalar, an array, a hash or a code reference. (Since perl-5.6 it can also refer to a file glob).

If you need access to the data from outside the package, then you can return it from a subroutine, or you can create a global variable (i.e. one which has a package prefix) which points or refers to it, and return that. The pointer or reference must be global so that you can refer to it by a fully qualified name. But just as in C, try to avoid having global variables. Using OO methods generally solves this problem by providing methods to get and set the desired value within the object, which can be lexically scoped inside the package and passed by reference.

The phrase "lexical variable" is a bit of a misnomer; we are really talking about "lexical symbols". The data can be referenced by a global symbol too, and in such cases when the lexical symbol goes out of scope the data will still be accessible through the global symbol. This is perfectly legitimate and cannot be compared to the terrible mistake of taking a pointer to an automatic C variable and returning it from a function--when the pointer is dereferenced there will be a segmentation fault. (Note for C/C++ programmers: having a function return a pointer to an auto variable is a disaster in C or C++; the perl equivalent, returning a reference to a lexical variable created in a function, is normal and useful.)

=over

=item *

C<my()> vs. C<use vars>:

With use vars(), you are making an entry in the symbol table, and you are telling the compiler that you are going to be referencing that entry without an explicit package name.

With my(), NO ENTRY IS PUT IN THE SYMBOL TABLE. The compiler figures out C<at compile time> which my() variables (i.e.
lexical variables) are the same as each other, and once you hit execute time you cannot go looking those variables up in the symbol table.

=item *

C<my()> vs. C<local()>:

local() creates a temporally-limited package-based scalar, array, hash, or glob -- when the scope of definition is exited at runtime, the previous value (if any) is restored. References to such a variable are B<also> global... only the value changes. (Aside: that is what causes variable suicide. :)

my() creates a lexically-limited non-package-based scalar, array, or hash -- when the scope of definition is exited at compile-time, the variable ceases to be accessible. Any references to such a variable at runtime turn into unique anonymous variables on each scope exit.

=back

=head2 Additional reading references

For more information see: L<Using global variables and sharing them between modules/packages|perl/Using_Global_Variables_and_Shari> and an article by Mark-Jason Dominus about how Perl handles variables and namespaces, and the difference between C<use vars()> and C<my()> - http://www.plover.com/~mjd/perl/FAQs/Namespaces.html .

=head1 my() Scoped Variable in Nested Subroutines

Before we proceed, let's make the assumption that we want to develop the code under the C<strict> pragma. We will use lexically scoped variables (with the help of the my() operator) whenever possible.

=head2 The Poison

Let's look at this code:

  nested.pl
  -----------
  #!/usr/bin/perl

  use strict;

  sub print_power_of_2 {
    my $x = shift;

    sub power_of_2 {
      return $x ** 2;
    }

    my $result = power_of_2();
    print "$x^2 = $result\n";
  }

  print_power_of_2(5);
  print_power_of_2(6);

Don't let the weird subroutine names fool you; the print_power_of_2() subroutine should print the square of the number passed to it. Let's run the code and see whether it works:

  % ./nested.pl

  5^2 = 25
  6^2 = 25

Ouch, something is wrong. Maybe there is a bug in Perl and it doesn't work correctly with the number 6? Let's try again using 5 and 7:

  print_power_of_2(5);
  print_power_of_2(7);

And run it:

  % ./nested.pl

  5^2 = 25
  7^2 = 25

Wow, does it work only for 5? How about using 3 and 5:

  print_power_of_2(3);
  print_power_of_2(5);

and the result is:

  % ./nested.pl

  3^2 = 9
  5^2 = 9

Now we start to understand--only the first call to the print_power_of_2() function works correctly, which makes us think that our code has some kind of memory for the results of the first execution, or that it ignores the arguments in subsequent executions.

=head2 The Diagnosis

Let's follow the guidelines and use the C<-w> flag. Now execute the code:

  % ./nested.pl

  Variable "$x" will not stay shared at ./nested.pl line 9.
  5^2 = 25
  6^2 = 25

We have never seen such a warning message before and we don't quite understand what it means. The C<diagnostics> pragma will certainly help us. Let's prepend this pragma before the C<strict> pragma in our code:

  #!/usr/bin/perl -w

  use diagnostics;
  use strict;

And execute it:

  % ./nested.pl

  Variable "$x" will not stay shared at ./nested.pl line 10 (#1)

    (W) An inner (nested) named subroutine is referencing a lexical
    variable defined in an outer subroutine.

    When the inner subroutine is called, it will probably see the value
    of the outer subroutine's variable as it was before and during the
    *first* call to the outer subroutine; in this case, after the first
    call to the outer subroutine is complete, the inner and outer
    subroutines will no longer share a common value for the variable.
    In other words, the variable will no longer be shared.
    Furthermore, if the outer subroutine is anonymous and references a
    lexical variable outside itself, then the outer and inner
    subroutines will never share the given variable.

    This problem can usually be solved by making the inner subroutine
    anonymous, using the sub {} syntax.  When inner anonymous subs that
    reference variables in outer subroutines are called or referenced,
    they are automatically rebound to the current values of such
    variables.

  5^2 = 25
  6^2 = 25

Well, now everything is clear. We have the B<inner> subroutine power_of_2() and the B<outer> subroutine print_power_of_2() in our code.

When the inner power_of_2() subroutine is called for the first time, it sees the value of the outer print_power_of_2() subroutine's C<$x> variable. On subsequent calls the inner subroutine's C<$x> variable won't be updated, no matter what new values are given to C<$x> in the outer subroutine. There are two copies of the C<$x> variable, no longer a single one shared by the two routines.

=head2 The Remedy

The C<diagnostics> pragma suggests that the problem can be solved by making the inner subroutine anonymous.

An anonymous subroutine can act as a I<closure> with respect to lexically scoped variables. Basically this means that if you define a subroutine in a particular B<lexical> context at a particular moment, then it will run in that same context later, even if called from outside that context. The upshot of this is that when the subroutine B<runs>, you get the same copies of the lexically scoped variables which were visible when the subroutine was B<defined>. So you can pass arguments to a function when you define it, as well as when you invoke it.

Let's rewrite the code to use this technique:

  anonymous.pl
  --------------
  #!/usr/bin/perl

  use strict;

  sub print_power_of_2 {
    my $x = shift;

    my $func_ref = sub {
      return $x ** 2;
    };

    my $result = &$func_ref();
    print "$x^2 = $result\n";
  }

  print_power_of_2(5);
  print_power_of_2(6);

Now C<$func_ref> contains a reference to an anonymous subroutine, which we later use when we need to get the power of two. Since it is anonymous, the subroutine will automatically be rebound to the new value of the outer scoped variable C<$x>, and the results will now be as expected.

Let's verify:

  % ./anonymous.pl

  5^2 = 25
  6^2 = 36

So we can see that the problem is solved.

=head1 Understanding Closures -- the Easy Way

In Perl, a closure is just a subroutine that refers to one or more lexical variables declared outside the subroutine itself, and it must therefore create a distinct clone of the environment on the way out. Both named subroutines and anonymous subroutines can be closures.

Here's how to tell if a subroutine is a closure or not:

  for (1..5) {
    push @a, sub { "hi there" };
  }

  for (1..5) {
    {
      my $b;
      push @b, sub { $b."hi there" };
    }
  }

  print "anon normal:\n", join "\t\n",@a,"\n";
  print "anon closure:\n",join "\t\n",@b,"\n";

which generates:

  anon normal:
  CODE(0x80568e4)
  CODE(0x80568e4)
  CODE(0x80568e4)
  CODE(0x80568e4)
  CODE(0x80568e4)

  anon closure:
  CODE(0x804b4c0)
  CODE(0x8056b54)
  CODE(0x8056bb4)
  CODE(0x80594d8)
  CODE(0x8059538)

Note how each code reference from the non-closure is identical, but the closure form must generate distinct coderefs to point at the distinct instances of the closure.
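Before we repeat the experiment with named subroutines, here is the classic practical use of those distinct environments: a counter generator, where each anonymous subroutine captures its own copy of the lexical variable (a minimal sketch; the script name is made up):

  make_counter.pl
  ---------------
  #!/usr/bin/perl -w

  use strict;

  sub make_counter {
    my $count = shift;
    return sub { $count++ };   # a closure over $count
  }

  my $c1 = make_counter(1);
  my $c2 = make_counter(10);

  print $c1->(), " ", $c1->(), " ", $c2->(), "\n";   # prints: 1 2 10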
And now the same with named subroutines:

  for (1..5) {
    sub a { "hi there" };
    push @a, \&a;
  }

  for (1..5) {
    {
      my $b;
      sub b { $b."hi there" };
      push @b, \&b;
    }
  }

  print "normal:\n", join "\t\n",@a,"\n";
  print "closure:\n",join "\t\n",@b,"\n";

which generates:

  normal:
  CODE(0x80568c0)
  CODE(0x80568c0)
  CODE(0x80568c0)
  CODE(0x80568c0)
  CODE(0x80568c0)

  closure:
  CODE(0x8056998)
  CODE(0x8056998)
  CODE(0x8056998)
  CODE(0x8056998)
  CODE(0x8056998)

We can see that both versions have generated the same code reference on every pass. For the subroutine I<a> it's easy, since it doesn't include any lexical variables defined outside it in the same lexical scope.

As for the subroutine I<b>, it's indeed a closure, but Perl won't recompile it, since it's a named subroutine (see the I<perlsub> manpage). This is something that we don't want to happen in our code unless we want it for this special effect, similar to I<static> variables in C.

This is the underpinning of that famous I<"won't stay shared"> message: a I<my> variable referenced in a named subroutine generates the same code reference on every pass, and therefore any future changes to the lexical variables outside of it are ignored.

=head1 When You Cannot Get Rid of The Inner Subroutine

First you might wonder, why in the world would someone need to define an inner subroutine? Well, for example, to reduce some of Perl's script startup overhead you might decide to write a daemon that will compile the scripts and modules only once, and cache the pre-compiled code in memory. When some script is to be executed, you just tell the daemon the name of the script to run and it will do the rest, and do it much faster since compilation has already taken place.

Seems like an easy task, and it is. The only problem is once the script is compiled, how do you execute it? Or let's put it the other way: after it was executed for the first time and it stays compiled in the daemon's memory, how do you call it again? If you could get all developers to code their scripts so that each has a subroutine called run() that will actually execute the code in the script, then we've solved half the problem.

But how does the daemon know how to refer to some specific script if they all run in the C<main::> namespace? One solution might be to ask the developers to declare a package in each and every script, with the package name derived from the script name. However, since there is a chance that there will be more than one script with the same name but residing in different directories, in order to prevent namespace collisions the directory has to be a part of the package name too. And don't forget that the script may be moved from one directory to another, so you will have to make sure that the package name is corrected every time the script gets moved.

But why enforce these strange rules on developers, when we can arrange for our daemon to do this work? For every script that the daemon is about to execute for the first time, the script should be wrapped inside a package whose name is constructed from the mangled path to the script, and a subroutine called run().
For example if the daemon is about to execute the script I</tmp/hello.pl>:

  hello.pl
  --------
  #!/usr/bin/perl
  print "Hello\n";

Prior to running it, the daemon will change the code to be:

  wrapped_hello.pl
  ----------------
  package cache::tmp::hello_2epl;

  sub run{
    #!/usr/bin/perl
    print "Hello\n";
  }

The package name is constructed from the prefix C<cache::>, each directory separation slash is replaced with C<::>, and non-alphanumeric characters are encoded so that for example C<.> (a dot) becomes C<_2e> (an underscore followed by the ASCII code for a dot in hex representation).

  % perl -e 'printf "%x",ord(".")'

prints: C<2e>. The encoding is the same as you see in URL escaping, except there the C<%> character is used (C<%2E>); since C<%> has a special meaning in Perl (it's the prefix of hash variables), it couldn't be used here, so an underscore is used instead.

Now when the daemon is requested to execute the script I</tmp/hello.pl>, all it has to do is to build the package name as before, based on the location of the script, and call its run() subroutine:

  use cache::tmp::hello_2epl;
  cache::tmp::hello_2epl::run();

We have just written a partial prototype of the daemon we wanted. The only outstanding problem is how to pass the path to the script to the daemon. This detail is left as an exercise for the reader.

If you are familiar with the C<Apache::Registry> module, you know that it works in almost the same way. It uses a different package prefix, and the generic function is called handler() and not run(). The scripts to run are passed through the HTTP protocol's headers.

Now you understand that there are cases where your normal subroutines can become inner, since if your script was a simple:

  simple.pl
  ---------
  #!/usr/bin/perl
  sub hello { print "Hello" }
  hello();

Wrapped into a run() subroutine it becomes:

  simple.pl
  ---------
  package cache::simple_2epl;

  sub run{
    #!/usr/bin/perl
    sub hello { print "Hello" }
    hello();
  }

Therefore, hello() is now an inner subroutine, and if you have my() scoped variables defined and altered outside hello() but used inside it, the code won't work as you expect from the second call on, as was explained in the previous section.

=head2 Remedies for Inner Subroutines

First of all there is nothing to worry about, as long as you don't forget to turn warnings on. If you do happen to have the "L<my() Scoped Variable in Nested Subroutines|perl/my_Scoped_Variable_in_Nested_S>" problem, Perl will always alert you.

Given that you have a script that has this problem, what are the ways to solve it? There are many of them, and we will discuss some of them here.

We will use the following code to show the different solutions.

  multirun.pl
  -----------
  #!/usr/bin/perl -w

  use strict;

  for (1..3){
    print "run: [time $_]\n";
    run();
  }

  sub run{
    my $counter = 0;

    increment_counter();
    increment_counter();

    sub increment_counter{
      $counter++;
      print "Counter is equal to $counter !\n";
    }
  } # end of sub run

This code executes the run() subroutine three times, which in turn initializes the C<$counter> variable to 0 every time it is executed, and then calls the inner subroutine increment_counter() twice. Sub increment_counter() prints C<$counter>'s value after incrementing it. One might expect to see the following output:

  run: [time 1]
  Counter is equal to 1 !
  Counter is equal to 2 !
  run: [time 2]
  Counter is equal to 1 !
  Counter is equal to 2 !
  run: [time 3]
  Counter is equal to 1 !
  Counter is equal to 2 !

But as we have already learned from the previous sections, this is not what we are going to see.
Indeed, when we run the script we see:

  % ./multirun.pl

  Variable "$counter" will not stay shared at ./multirun.pl line 18.
  run: [time 1]
  Counter is equal to 1 !
  Counter is equal to 2 !
  run: [time 2]
  Counter is equal to 3 !
  Counter is equal to 4 !
  run: [time 3]
  Counter is equal to 5 !
  Counter is equal to 6 !

Obviously, the C<$counter> variable is not reinitialized on each execution of run(). It retains its value from the previous execution, and sub increment_counter() increments that.

One of the workarounds is to use globally declared variables, with the C<vars> pragma.

  multirun1.pl
  -----------
  #!/usr/bin/perl -w

  use strict;
  use vars qw($counter);

  for (1..3){
    print "run: [time $_]\n";
    run();
  }

  sub run {
    $counter = 0;

    increment_counter();
    increment_counter();

    sub increment_counter{
      $counter++;
      print "Counter is equal to $counter !\n";
    }
  } # end of sub run

If you run this, as with the other solutions offered below, the expected output will be generated:

  % ./multirun1.pl

  run: [time 1]
  Counter is equal to 1 !
  Counter is equal to 2 !
  run: [time 2]
  Counter is equal to 1 !
  Counter is equal to 2 !
  run: [time 3]
  Counter is equal to 1 !
  Counter is equal to 2 !

By the way, the warning we saw before has gone, and so has the problem, since there is no C<my()> (lexically defined) variable used in the nested subroutine.

Another approach is to use fully qualified variables. This is better, since less memory will be used, but it adds a typing overhead:

  multirun2.pl
  -----------
  #!/usr/bin/perl -w

  use strict;

  for (1..3){
    print "run: [time $_]\n";
    run();
  }

  sub run {
    $main::counter = 0;

    increment_counter();
    increment_counter();

    sub increment_counter{
      $main::counter++;
      print "Counter is equal to $main::counter !\n";
    }
  } # end of sub run

You can also pass the variable to the subroutine by value and make the subroutine return it after it has been updated. This adds time and memory overheads, so it may not be a good idea if the variable can be very large, or if speed of execution is an issue.

Don't rely on the fact that the variable is small during the development of the application; it can grow quite big in situations you don't expect. For example, a very simple HTML form text entry field can return a few megabytes of data if one of your users is bored and wants to test how good your code is. It's not uncommon to see users copy-and-paste 10Mb core dump files into a form's text fields and then submit them for your script to process.

  multirun3.pl
  -----------
  #!/usr/bin/perl -w

  use strict;

  for (1..3){
    print "run: [time $_]\n";
    run();
  }

  sub run {
    my $counter = 0;

    $counter = increment_counter($counter);
    $counter = increment_counter($counter);

    sub increment_counter{
      my $counter = shift;

      $counter++;
      print "Counter is equal to $counter !\n";

      return $counter;
    }
  } # end of sub run

Finally, you can use references to do the job. The version of increment_counter() below accepts a reference to the C<$counter> variable and increments its value after first dereferencing it. When you use a reference, the variable you use inside the function is physically the same bit of memory as the one outside the function. This technique is often used to enable a called function to modify variables in a calling function.
  multirun4.pl
  -----------
  #!/usr/bin/perl -w

  use strict;

  for (1..3){
    print "run: [time $_]\n";
    run();
  }

  sub run {
    my $counter = 0;

    increment_counter(\$counter);
    increment_counter(\$counter);

    sub increment_counter{
      my $r_counter = shift;

      $$r_counter++;
      print "Counter is equal to $$r_counter !\n";
    }
  } # end of sub run

Here is yet another, more obscure, usage of references. We modify the value of C<$counter> inside the subroutine by using the fact that variables in C<@_> are aliases for the actual scalar parameters. Thus if you called a function with two arguments, those would be stored in C<$_[0]> and C<$_[1]>. In particular, if an element C<$_[0]> is updated, the corresponding argument is updated (or an error occurs if it is not updatable, as would be the case when calling the function with a literal, e.g. I<increment_counter(5)>).

  multirun5.pl
  -----------
  #!/usr/bin/perl -w

  use strict;

  for (1..3){
    print "run: [time $_]\n";
    run();
  }

  sub run {
    my $counter = 0;

    increment_counter($counter);
    increment_counter($counter);

    sub increment_counter{
      $_[0]++;
      print "Counter is equal to $_[0] !\n";
    }
  } # end of sub run

The approach given above should be properly documented, of course.

Here is a solution that avoids the problem entirely by splitting the code into two files; the first is really just a wrapper and loader, the second file contains the heart of the code.

  multirun6.pl
  -----------
  #!/usr/bin/perl -w

  use strict;
  require 'multirun6-lib.pl' ;

  for (1..3){
    print "run: [time $_]\n";
    run();
  }

Separate file:

  multirun6-lib.pl
  ----------------
  use strict ;

  my $counter;

  sub run {
    $counter = 0;

    increment_counter();
    increment_counter();
  }

  sub increment_counter{
    $counter++;
    print "Counter is equal to $counter !\n";
  }

  1 ;

Now you have at least six workarounds to choose from. For more information please refer to the perlref and perlsub manpages.

=head1 use(), require(), do(), %INC and @INC Explained

=head2 The @INC array

C<@INC> is a special Perl variable which is the equivalent of the shell's C<PATH> variable. Whereas C<PATH> contains a list of directories to search for executables, C<@INC> contains a list of directories from which Perl modules and libraries can be loaded.

When you use(), require() or do() a filename or a module, Perl gets a list of directories from the C<@INC> variable and searches them for the file it was requested to load. If the file that you want to load is not located in one of the listed directories, you have to tell Perl where to find the file. You can either provide a path relative to one of the directories in C<@INC>, or you can provide the full path to the file.

=head2 The %INC hash

C<%INC> is another special Perl variable that is used to cache the names of the files and the modules that were successfully loaded and compiled by use(), require() or do() statements.

Before attempting to load a file or a module with use() or require(), Perl checks whether it's already in the C<%INC> hash. If it's there, the loading and therefore the compilation are not performed at all. Otherwise the file is loaded into memory and an attempt is made to compile it. do() does unconditional loading--no lookup in the C<%INC> hash is made.

If the file is successfully loaded and compiled, a new key-value pair is added to C<%INC>. The key is the name of the file or module as it was passed to one of the three functions we have just mentioned, and if it was found in any of the C<@INC> directories except C<"."> the value is the full path to it in the file system.
The following examples will make it easier to understand the logic.

First, let's see what the contents of C<@INC> are on my system:

  % perl -e 'print join "\n", @INC'
  /usr/lib/perl5/5.00503/i386-linux
  /usr/lib/perl5/5.00503
  /usr/lib/perl5/site_perl/5.005/i386-linux
  /usr/lib/perl5/site_perl/5.005
  .

Notice the C<.> (current directory) is the last directory in the list.

Now let's load the module C<strict.pm> and see the contents of C<%INC>:

  % perl -e 'use strict; print map {"$_ => $INC{$_}\n"} keys %INC'

  strict.pm => /usr/lib/perl5/5.00503/strict.pm

Since C<strict.pm> was found in the I</usr/lib/perl5/5.00503/> directory and I</usr/lib/perl5/5.00503/> is a part of C<@INC>, C<%INC> includes the full path as the value for the key C<strict.pm>.

Now let's create the simplest module in C</tmp/test.pm>:

  test.pm
  -------
  1;

It does nothing, but returns a true value when loaded. Now let's load it in different ways:

  % cd /tmp
  % perl -e 'use test; print map {"$_ => $INC{$_}\n"} keys %INC'

  test.pm => test.pm

Since the file was found relative to C<.> (the current directory), the relative path is inserted as the value. If we alter C<@INC> by adding I</tmp> to the end:

  % cd /tmp
  % perl -e 'BEGIN{push @INC, "/tmp"} use test; \
  print map {"$_ => $INC{$_}\n"} keys %INC'

  test.pm => test.pm

Here we still get the relative path, since the module was found first relative to C<".">. The directory I</tmp> was placed after C<.> in the list. If we execute the same code from a different directory, the C<"."> directory won't match,

  % cd /
  % perl -e 'BEGIN{push @INC, "/tmp"} use test; \
  print map {"$_ => $INC{$_}\n"} keys %INC'

  test.pm => /tmp/test.pm

so we get the full path. We can also prepend the path with unshift(), so that it will be used for matching before C<"."> and therefore we will get the full path as well:

  % cd /tmp
  % perl -e 'BEGIN{unshift @INC, "/tmp"} use test; \
  print map {"$_ => $INC{$_}\n"} keys %INC'

  test.pm => /tmp/test.pm

The code:

  BEGIN{unshift @INC, "/tmp"}

can be replaced with the more elegant:

  use lib "/tmp";

This is almost equivalent to our C<BEGIN> block and is the recommended approach.

These approaches to modifying C<@INC> can be labor intensive, since if you want to move the script around in the file-system you have to modify the path. This can be painful, for example, when you move your scripts from development to a production server.

There is a module called C<FindBin> which solves this problem in the plain Perl world, but unfortunately it won't work under mod_perl, since it's a module, and like any module it's loaded only once. So the first script using it will have all the settings correct, but the rest of the scripts will not, if located in a different directory from the first. For the sake of completeness, I'll present this module anyway.

If you use this module, you don't need to write a hard coded path. The following snippet does all the work for you (the file is I</tmp/load.pl>):

  load.pl
  -------
  #!/usr/bin/perl

  use FindBin ();
  use lib "$FindBin::Bin";
  use test;
  print "test.pm => $INC{'test.pm'}\n";

In the above example C<$FindBin::Bin> is equal to I</tmp>. If we move the script somewhere else, e.g. to I</tmp/new_dir>, then C<$FindBin::Bin> equals I</tmp/new_dir>.

  % /tmp/load.pl

  test.pm => /tmp/test.pm

This is just like C<use lib>, except that no hard coded path is required.

You can use this workaround to make it work under mod_perl.

  do 'FindBin.pm';
  unshift @INC, "$FindBin::Bin";
  require test;
  # maybe test::import( ...
  # ... ) here if you need to import stuff

This has a slight overhead because it will load from disk and recompile the C<FindBin> module on each request. So it may not be worth it.

=head2 Modules, Libraries and Program Files

Before we proceed, let's define what we mean by I<module>, I<library> and I<program file>.

=over

=item * Libraries

These are files which contain Perl subroutines and other code.

When these are used to break up a large program into manageable chunks they don't generally include a package declaration; when they are used as subroutine libraries they often do have a package declaration.

Their last statement must return true; a simple C<1;> statement ensures that.

They can be named in any way desired, but generally their extension is I<.pl>. Examples:

  config.pl
  ----------
  # No package so defaults to main::
  $dir = "/home/httpd/cgi-bin";
  $cgi = "/cgi-bin";
  1;

  mysubs.pl
  ----------
  # No package so defaults to main::
  sub print_header{
    print "Content-type: text/plain\r\n\r\n";
  }
  1;

  web.pl
  ------------
  package web ;
  # Call like this: web::print_with_class('loud',"Don't shout!");
  sub print_with_class{
    my( $class, $text ) = @_ ;
    print qq{<span class="$class">$text</span>};
  }
  1;

=item * Modules

A file which contains perl subroutines and other code. It generally declares a package name at its beginning.

Modules are generally used either as function libraries (for which I<.pl> files are still used, though less commonly), or as object libraries where a module is used to define a class and its methods.

Its last statement must return true.

The naming convention requires it to have a I<.pm> extension. Example:

  MyModule.pm
  -----------
  package My::Module;
  $My::Module::VERSION = 0.01;

  sub new{ return bless {}, shift;}

  END { print "Quitting\n"}
  1;

=item * Program Files

Many Perl programs exist as a single file. Under Linux and other Unix-like operating systems the file often has no suffix, since the operating system can determine that it is a perl script from the first line (the shebang line); alternatively, if it's Apache that executes the code, there is a variety of ways to tell Apache how and when the file should be executed. Under Windows a suffix is normally used, for example C<.pl> or C<.plx>.

The program file will normally C<require()> any libraries and C<use()> any modules it requires for execution.

It will contain Perl code but won't usually have any package names.

Its last statement may return anything or nothing.

=back

=head2 require()

require() reads a file containing Perl code and compiles it. Before attempting to load the file it looks up the argument in C<%INC> to see whether it has already been loaded. If it has, require() just returns without doing a thing. Otherwise an attempt will be made to load and compile the file.

require() has to find the file it has to load. If the argument is a full path to the file, it just tries to read it. For example:

  require "/home/httpd/perl/mylibs.pl";

If the path is relative, require() will attempt to search for the file in all the directories listed in C<@INC>. For example:

  require "mylibs.pl";

If there is more than one occurrence of the file with the same name in the directories listed in C<@INC>, the first occurrence will be used.

The file must return I<TRUE> as the last statement to indicate successful execution of any initialization code. Since you never know what changes the file will go through in the future, you cannot be sure that the last statement will always return I<TRUE>. That's why the suggestion is to put "C<1;>" at the end of the file.
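To see what happens when the trailing true value is missing, consider this minimal sketch (the file name is made up for the illustration); the last statement evaluates to 0, so require() refuses to accept the file:

  broken_lib.pl
  -------------
  # No package so defaults to main::
  sub hello { print "Hello\n" }
  my $loaded = 0; # oops -- the last statement evaluates to 0, not true

  % perl -e 'require "broken_lib.pl"'
  broken_lib.pl did not return a true value at -e line 1.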
Although you should use the real filename for most files, if the file is a L<module|perl/Modules__Libraries_and_Program_Files>, you may use the following convention instead:

  require My::Module;

This is equivalent to:

  require "My/Module.pm";

If require() fails to load the file, either because it couldn't find the file in question, or the code failed to compile, or it didn't return I<TRUE>, then the program would die(). To prevent this, the require() statement can be enclosed in an eval() exception-handling block, as in this example:

  require.pl
  ----------
  #!/usr/bin/perl -w

  eval { require "/file/that/does/not/exists"};

  if ($@) {
    print "Failed to load, because : $@"
  }

  print "\nHello\n";

When we execute the program:

  % ./require.pl

  Failed to load, because : Can't locate /file/that/does/not/exists in @INC
  (@INC contains: /usr/lib/perl5/5.00503/i386-linux /usr/lib/perl5/5.00503
  /usr/lib/perl5/site_perl/5.005/i386-linux /usr/lib/perl5/site_perl/5.005 .)
  at require.pl line 3.

  Hello

We see that the program didn't die(), because I<Hello> was printed. This I<trick> is useful when you want to check whether a user has some module installed, and if she hasn't, it's not critical: perhaps the program can run without this module, with reduced functionality.

If we remove the eval() part and try again:

  require1.pl
  ----------
  #!/usr/bin/perl -w

  require "/file/that/does/not/exists";
  print "\nHello\n";

  % ./require1.pl

  Can't locate /file/that/does/not/exists in @INC (@INC contains:
  /usr/lib/perl5/5.00503/i386-linux /usr/lib/perl5/5.00503
  /usr/lib/perl5/site_perl/5.005/i386-linux /usr/lib/perl5/site_perl/5.005 .)
  at require1.pl line 3.

The program just die()s in the last example, which is what you want in most cases.

For more information refer to the perlfunc manpage.

=head2 use()

use(), just like require(), loads and compiles files containing Perl code, but it works with L<modules|perl/Modules__Libraries_and_Program_Files> only. The only way to pass a module to load is by its module name and not its filename. If the module is located in I<MyCode.pm>, the correct way to use() it is:

  use MyCode

and not:

  use "MyCode.pm"

use() translates the passed argument into a file name, replacing C<::> with the operating system's path separator (normally C</>) and appending I<.pm> at the end. So C<My::Module> becomes I<My/Module.pm>.

use() is exactly equivalent to:

  BEGIN { require Module; Module->import(LIST); }

Internally it calls require() to do the loading and compilation chores. When require() finishes its job, import() is called unless C<()> is the second argument. The following pairs are equivalent:

  use MyModule;
  BEGIN {require MyModule; MyModule->import; }

  use MyModule qw(foo bar);
  BEGIN {require MyModule; MyModule->import("foo","bar"); }

  use MyModule ();
  BEGIN {require MyModule; }

The first pair imports the default tags. This happens if the module sets C<@EXPORT> to a list of tags to be exported by default. The module's manpage normally describes what tags are exported by default.

The second pair imports only the tags passed as arguments.

The third pair describes the case where the caller does not want any symbols to be imported.

C<import()> is not a builtin function; it's just an ordinary static method call into the "C<MyModule>" package to tell the module to import the list of features back into the current package. See the Exporter manpage for more information.
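To see the difference between the importing and non-importing forms in practice, here is a quick illustration with the standard C<POSIX> module (just an example; any module that uses C<Exporter> would do):

  % perl -e 'use POSIX qw(floor); print floor(3.7), "\n"'
  3

  % perl -e 'use POSIX (); print POSIX::floor(3.7), "\n"'
  3

In the first case floor() is imported into the current package; in the second nothing is imported, so the fully qualified name must be used.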
When you write your own modules, always remember that it's better to use C<@EXPORT_OK> instead of C<@EXPORT>, since the former doesn't export symbols unless it was asked to. Exports pollute the namespace of the module user. Also avoid short or common symbol names to reduce the risk of name clashes.

When functions and variables aren't exported you can still access them using their full names, like C<$My::Module::bar> or C<My::Module::foo()>. By convention you can use a leading underscore on names to informally indicate that they are I<internal> and not for public use.

There's a corresponding "C<no>" command that un-imports symbols imported by C<use>, i.e., it calls C<Module-E<gt>unimport(LIST)> instead of C<import()>.

=head2 do()

While do() behaves almost identically to require(), it reloads the file unconditionally. It doesn't check C<%INC> to see whether the file was already loaded.

If do() cannot read the file, it returns C<undef> and sets C<$!> to report the error. If do() can read the file but cannot compile it, it returns C<undef> and puts an error message in C<$@>. If the file is successfully compiled, do() returns the value of the last expression evaluated.

=head1 Using Global Variables and Sharing Them Between Modules/Packages

It helps when you code your application in a structured way, using perl packages, but as you probably know, once you start using packages it's much harder to share variables between the various packages. A configuration package comes to mind as a good example of a package whose variables should be accessible from other modules.

Of course, using Object Oriented (OO) programming is the best way to provide access to variables, through accessor methods. But if you are not yet ready for OO techniques you can still benefit from the techniques we are going to talk about.

=head2 Making Variables Global

When you first wrote C<$x> in your code you created a (package) global variable. It is visible everywhere in your program, although if used in a package other than the package in which it was declared (C<main::> by default), it must be referred to with its fully qualified name, unless you have imported this variable with import(). This will work only if you do not use the C<strict> pragma; but you I<have> to use this pragma if you want to run your scripts under mod_perl. Read L<The strict pragma|porting/The_strict_pragma> to find out why.

=head2 Making Variables Global With strict Pragma On

First you use:

  use strict;

Then you use:

  use vars qw($scalar %hash @array);

This declares the named variables as package globals in the current package. They may be referred to within the same file and package with their unqualified names; and in different files/packages with their fully qualified names.

With perl5.6 you can use the C<our> operator instead:

  our ($scalar, %hash, @array);

If you want to share package global variables between packages, here is what you can do.

=head2 Using Exporter.pm to Share Global Variables

Assume that you want to share the C<CGI.pm> object (I will use C<$q>) between your modules. For example, you create it in C<script.pl>, but you want it to be visible in C<My::HTML>. First, you make C<$q> global.

  script.pl:
  ----------------
  use vars qw($q);
  use CGI;
  use lib qw(.);
  use My::HTML qw($q); # My/HTML.pm is in the same dir as script.pl
  $q = CGI->new;

  My::HTML::printmyheader();

Note that we have imported C<$q> from C<My::HTML>.
And C<My::HTML> does the export of C<$q>:

  My/HTML.pm
  ----------------
  package My::HTML;

  use strict;

  BEGIN {
    use Exporter ();

    @My::HTML::ISA       = qw(Exporter);
    @My::HTML::EXPORT    = qw();
    @My::HTML::EXPORT_OK = qw($q);
  }

  use vars qw($q);

  sub printmyheader{
    # Whatever you want to do with $q... e.g.
    print $q->header();
  }

  1;

So C<$q> is shared between the C<My::HTML> package and C<script.pl>. It will work vice versa as well, if you create the object in C<My::HTML> and use it in C<script.pl>. You have true sharing, since if you change C<$q> in C<script.pl>, it will be changed in C<My::HTML> as well.

What if you need to share C<$q> between more than two packages? For example you want C<My::Doc> to share C<$q> as well.

You leave C<My::HTML> untouched, and modify I<script.pl> to include:

  use My::Doc qw($q);

Then you add the same C<Exporter> code that we used in C<My::HTML> into C<My::Doc>, so that it also exports C<$q>.

One possible pitfall is when you want to use C<My::Doc> in both C<My::HTML> and I<script.pl>. Only if you add

  use My::Doc qw($q);

into C<My::HTML> will C<$q> be shared. Otherwise C<My::Doc> will not share C<$q> any more. To make things clear here is the code:

  script.pl:
  ----------------
  use vars qw($q);
  use CGI;
  use lib qw(.);
  use My::HTML qw($q); # My/HTML.pm is in the same dir as script.pl
  use My::Doc  qw($q); # Ditto
  $q = CGI->new;

  My::HTML::printmyheader();

  My/HTML.pm
  ----------------
  package My::HTML;

  use strict;

  BEGIN {
    use Exporter ();

    @My::HTML::ISA       = qw(Exporter);
    @My::HTML::EXPORT    = qw();
    @My::HTML::EXPORT_OK = qw($q);
  }

  use vars    qw($q);
  use My::Doc qw($q);

  sub printmyheader{
    # Whatever you want to do with $q... e.g.
    print $q->header();

    My::Doc::printtitle('Guide');
  }

  1;

  My/Doc.pm
  ----------------
  package My::Doc;

  use strict;

  BEGIN {
    use Exporter ();

    @My::Doc::ISA       = qw(Exporter);
    @My::Doc::EXPORT    = qw();
    @My::Doc::EXPORT_OK = qw($q);
  }

  use vars qw($q);

  sub printtitle{
    my $title = shift || 'None';

    print $q->h1($title);
  }

  1;

=head2 Using the Perl Aliasing Feature to Share Global Variables

As the title says, you can import a variable into a script or module without using C<Exporter.pm>. I have found it useful to keep all the configuration variables in one module, C<My::Config>. But then I have to export all the variables in order to use them in other modules, which is bad for two reasons: it pollutes other packages' name spaces with extra tags, which increases the memory requirements; and it adds the overhead of keeping track of which variables should be exported from the configuration module, and which imported, for each particular package.

I solve this problem by keeping all the variables in one hash C<%c> and exporting that. Here is an example of C<My::Config>:

  package My::Config;
  use strict;
  use vars qw(%c);
  %c = (
    # All the configs go here
    scalar_var => 5,

    array_var  => [qw(foo bar)],

    hash_var   => {
                   foo => 'Foo',
                   bar => 'BARRR',
                  },
  );
  1;

Now in packages that want to use the configuration variables I have either to use the fully qualified names like C<$My::Config::test>, which I dislike, or import them as described in the previous section. But hey, since we have only one variable to handle, we can make things even simpler and save the loading of the C<Exporter.pm> package.
We will use the Perl aliasing feature for exporting and saving the keystrokes:

  package My::HTML;
  use strict;
  use lib qw(.);
    # Global Configuration now aliased to global %c
  use My::Config (); # My/Config.pm in the same dir as script.pl
  use vars qw(%c);
  *c = \%My::Config::c;

  # Now you can access the variables from My::Config
  print $c{scalar_var};
  print $c{array_var}[0];
  print $c{hash_var}{foo};

Of course C<%c> is global everywhere you use it as described above, and if you change it somewhere it will affect any other packages you have aliased C<%My::Config::c> to.

Note that aliases work only with global or C<local()> variables - you cannot write:

  my *c = \%My::Config::c; # ERROR!

But you can write:

  local *c = \%My::Config::c;

For more information about aliasing, refer to the Camel book, second edition, pages 51-52.

=head2 Using Non-Hardcoded Configuration Module Names

You have just seen how to use a configuration module for configuration centralization and easy access to the information stored in this module. However, there is somewhat of a chicken-and-egg problem--how to let your other modules know the name of this file? Hardcoding the name is brittle--if you have only a single project it should be fine, but if you have several projects which use different configurations and you want to reuse their code, you will have to find all instances of the hardcoded name and replace them.

Another solution could be to use the same name for the configuration module, like C<My::Config>, but to put a different copy of it into different locations. But this won't work under mod_perl because of the namespace collision: you cannot load different modules which use the same name; only the first one will be loaded.

Luckily, there is another solution which allows us to stay flexible. C<PerlSetVar> comes to the rescue. Just like with environment variables, you can set server-wide Perl variables which can be retrieved from any module and script. Those statements are placed into the I<httpd.conf> file. For example

  PerlSetVar FooBaseDir       /home/httpd/foo
  PerlSetVar FooConfigModule  Foo::Config

Now we require() the file where the above configuration will be used.
  PerlRequire /home/httpd/perl/startup.pl

In the I<startup.pl> we might have the following code:

  # retrieve the configuration module path
  use Apache ();
  my $s = Apache->server;
  my $base_dir      = $s->dir_config('FooBaseDir')      || '';
  my $config_module = $s->dir_config('FooConfigModule') || '';
  die "FooBaseDir and FooConfigModule aren't set in httpd.conf"
      unless $base_dir and $config_module;

  # build the real path to the config module
  my $path = "$base_dir/$config_module";
  $path =~ s|::|/|g;
  $path .= ".pm";
  # we have something like "/home/httpd/foo/Foo/Config.pm"

  # now we can pull in the configuration module
  require $path;

Now we know the module name and it's loaded, so for example if we need to use some variables stored in this module to open a database connection, we will do:

  Apache::DBI->connect_on_init
  ("DBI:mysql:${$config_module.'::DB_NAME'}::${$config_module.'::SERVER'}",
   ${$config_module.'::USER'},
   ${$config_module.'::USER_PASSWD'},
   {
    PrintError => 1, # warn() on errors
    RaiseError => 0, # don't die on error
    AutoCommit => 1, # commit executes immediately
   }
  );

Where variables like:

  ${$config_module.'::USER'}

in our example are really:

  $Foo::Config::USER

If you want to access these variables from within your code at run time, then instead of accessing the server object C<$s>, use the request object C<$r>:

  my $r = shift;
  my $base_dir      = $r->dir_config('FooBaseDir')      || '';
  my $config_module = $r->dir_config('FooConfigModule') || '';

=head1 The Scope of the Special Perl Variables

Special Perl variables like C<$|> (buffering), C<$^T> (script's start time), C<$^W> (warnings mode), C<$/> (input record separator), C<$\> (output record separator) and many more are all true global variables; they do not belong to any particular package (not even C<main::>) and are universally available. This means that if you change them, you change them anywhere across the entire program; furthermore you cannot scope them with my(). However you can local()ise them, which means that any changes you apply will only last until the end of the enclosing scope. In the mod_perl situation, where the child server doesn't usually exit, if in one of your scripts you modify a global variable, it will be changed for the rest of the process' life and will affect all the scripts executed by the same process. Therefore localizing these variables is highly recommended; I'd say mandatory.

We will demonstrate the case on the input record separator variable. If you undefine this variable, the diamond operator (readline) will suck in the whole file at once if you have enough memory. Remembering this, you should never write code like the example below.

  $/ = undef; # BAD!
  open IN, "file" ....
  # slurp it all into a variable
  $all_the_file = <IN>;

The proper way is to have a local() keyword before the special variable is changed, like this:

  local $/ = undef;
  open IN, "file" ....
  # slurp it all inside a variable
  $all_the_file = <IN>;

But there is a catch. local() will propagate the changed value to the code below it. The modified value will be in effect until the script terminates, unless it is changed again somewhere else in the script.

A cleaner approach is to enclose the whole of the code that is affected by the modified variable in a block, like this:

  {
    local $/ = undef;
    open IN, "file" ....
    # slurp it all inside a variable
    $all_the_file = <IN>;
  }

That way when Perl leaves the block it restores the original value of the C<$/> variable, and you don't need to worry elsewhere in your program about its value being changed here.
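Here is a tiny demonstration of that restore-on-block-exit behaviour, this time using C<$\> (the output record separator), so the effect is immediately visible:

  #!/usr/bin/perl -w
  {
    local $\ = "!\n";   # every print in this block appends "!\n"
    print "inside";     # prints: inside!
  }
  print "outside\n";    # $\ has been restored to its original value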
Note that if you call a subroutine after you've set a global variable but within the enclosing block, the global variable will be visible with its new value inside the subroutine.

=head1 Compiled Regular Expressions

When using a regular expression that contains an interpolated Perl variable, if it is known that the variable (or variables) will not change during the execution of the program, a standard optimization technique is to add the C</o> modifier to the regex pattern. This directs the compiler to build the internal table once, for the entire lifetime of the script, rather than every time the pattern is executed. Consider:

  my $pat = '^foo$'; # likely to be input from an HTML form field
  foreach( @list ) {
    print if /$pat/o;
  }

This is usually a big win in loops over lists, or when using the C<grep()> or C<map()> operators.

In long-lived mod_perl scripts, however, the variable may change with each invocation and this can pose a problem. The first invocation of a fresh httpd child will compile the regex and perform the search correctly. However, all subsequent uses by that child will continue to match the original pattern, regardless of the current contents of the Perl variables the pattern is supposed to depend on. Your script will appear to be broken.

There are two solutions to this problem:

The first is to use C<eval q//>, to force the code to be evaluated each time. Just make sure that the eval block covers the entire loop of processing, and not just the pattern match itself.

The above code fragment would be rewritten as:

  my $pat = '^foo$';
  eval q{
    foreach( @list ) {
      print if /$pat/o;
    }
  }

Just saying:

  foreach( @list ) {
    eval q{ print if /$pat/o; };
  }

means that we recompile the regex for every element in the list, even though the regex doesn't change.

You can use this approach if you require more than one pattern match operator in a given section of code. If the section contains only one operator (be it an C<m//> or C<s///>), you can rely on the property of the null pattern, which reuses the last pattern seen. This leads to the second solution, which also eliminates the use of eval.

The above code fragment becomes:

  my $pat = '^foo$';
  "something" =~ /$pat/; # dummy match (MUST NOT FAIL!)
  foreach( @list ) {
    print if //;
  }

The only gotcha is that the dummy match that boots the regular expression engine must absolutely, positively succeed, otherwise the pattern will not be cached, and the C<//> will match everything. If you can't count on fixed text to ensure the match succeeds, you have two possibilities.

If you can guarantee that the pattern variable contains no meta-characters (things like *, +, ^, $...), you can use the dummy match:

  $pat =~ /\Q$pat\E/; # guaranteed if no meta-characters present

If there is a possibility that the pattern can contain meta-characters, you should search for the pattern or the non-searchable \377 character as follows:

  "\377" =~ /$pat|^\377$/; # guaranteed if meta-characters present

Another approach depends on the complexity of the regex to which you apply this technique. One common situation in which a compiled regex is usually more efficient is when you are matching any one of a group of patterns over and over again.

Maybe with a helper routine, it's easier to remember. Here is one slightly modified from Jeffrey Friedl's example in his book "I<Mastering Regular Expressions>".
##################################################### # Build_MatchMany_Function # -- Input: list of patterns # -- Output: A code ref which matches its $_[0] # against ANY of the patterns given in the # "Input", efficiently. # sub Build_MatchMany_Function { my @R = @_; my $expr = join '||', map { "\$_[0] =~ m/\$R[$_]/o" } ( 0..$#R ); my $matchsub = eval "sub { $expr }"; die "Failed in building regex @R: $@" if $@; $matchsub; } Example usage: @some_browsers = qw(Mozilla Lynx MSIE AmigaVoyager lwp libwww); $Known_Browser = Build_MatchMany_Function(@some_browsers); while (<ACCESS_LOG>) { # ... $browser = get_browser_field($_); if ( ! &$Known_Browser($browser) ) { print STDERR "Unknown Browser: $browser\n"; } # ... } And of course you can use the qr() operator, which makes the code even more efficient: my $pat = '^foo$'; my $re = qr($pat); foreach( @list ) { print if /$re/; } The qr() operator compiles the pattern for each request and then uses the compiled version in the actual match (note that the C</o> modifier must not be used here: it would lock in the first pattern and bring back the very problem we are trying to avoid). =head1 Exception Handling for mod_perl Here are some guidelines for S<clean(er)> exception handling in mod_perl, although the technique presented can be applied to all of your Perl programming. The reasoning behind this document is the current broken status of C<$SIG{__DIE__}> in the perl core - see both the perl5-porters and the mod_perl mailing list archives for details on this discussion. (It's broken in at least Perl v5.6.0 and probably in later versions as well). In short summary, $SIG{__DIE__} is a little bit too global, and catches exceptions even when you want to catch them yourself, using an C<eval{}> block. =head2 Trapping Exceptions in Perl To trap an exception in Perl we use the C<eval{}> construct. Many people initially make the mistake of thinking that this is the same as the C<eval EXPR> construct, which compiles and executes code at run time, but that's not the case. C<eval{}> compiles at compile time, just like the rest of your code, and has next to zero run-time penalty. For the hardcore C programmers among you, it uses the C<setjmp/longjmp> POSIX routines internally, just like C++ exceptions. When in an eval block, if the code being executed die()'s for any reason, an exception is thrown. This exception can be caught by examining the C<$@> variable immediately after the eval block; if C<$@> is true then an exception occurred and C<$@> contains the exception in the form of a string. The full construct looks like this: eval { # Some code here }; # Note important semi-colon there if ($@) # $@ contains the exception that was thrown { # Do something with the exception } else # optional { # No exception was thrown } Most of the time when you see these exception handlers there is no else block, because it tends to be OK if the code didn't throw an exception.
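As a minimal concrete illustration (the configuration file name is invented), the following snippet traps a failed open() that was promoted to an exception with die(), and converts it into a warning plus a fallback value instead of aborting the request:

  sub read_config_or_default {
      my $data;
      eval {
          open CONF, "/etc/myapp.conf" or die "can't open config: $!\n";
          local $/;                # slurp the whole file
          $data = <CONF>;
          close CONF;
      };
      if ($@) {
          # we caught the die(): log it and fall back to a default
          warn "falling back to defaults: $@";
          $data = '';
      }
      return $data;
  }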
Perl's exception handling is similar to that of other languages, though it may not seem so at first sight:

  Perl                              Other language
  --------------------------------- ------------------------------------
  eval {                            try {
    # execute here                    // execute here
    # raise our own exception:        // raise our own exception:
    die "Oops" if /error/;            if(error==1){throw Exception.Oops;}
    # execute more                    // execute more
  };                                }
  if($@) {                          catch {
    # handle exceptions               switch( Exception.id ) {
    if( $@ =~ /Fail/ ) {                Fail :
      print "Failed\n" ;                  fprintf( stderr, "Failed\n" ) ;
    }                                     break ;
    elsif( $@ =~ /Oops/ ) {             Oops :
      # Pass it up the chain              throw Exception ;
      die if $@ =~ /Oops/;
    }
    else {                              default :
      # handle all other                  # handle all other
    }                                     # exceptions here
  }                                   }
  else { # optional                 }
    # all is well                   // If we got here all is OK or handled
  }
  # all is well or has been handled

=head2 Alternative Exception Handling Techniques An often suggested method for handling global exceptions in mod_perl, and other perl programs in general, is a B<__DIE__> handler, which can be set up by either assigning a function name as a string to C<$SIG{__DIE__}> (not particularly recommended, because of the possible namespace clashes) or assigning a code reference to C<$SIG{__DIE__}>. The usual way of doing so is to use an anonymous subroutine: $SIG{__DIE__} = sub { print "Eek - we died with:\n", $_[0]; }; The current problem with this is that C<$SIG{__DIE__}> is a global setting in your script, so while you can potentially hide away your exceptions in some external module, the execution of C<$SIG{__DIE__}> is fairly magical, and interferes not just with your code, but with all code in every module you import. Beyond the magic involved, C<$SIG{__DIE__}> actually interferes with perl's normal exception handling mechanism, the C<eval{}> construct. Witness: $SIG{__DIE__} = sub { print "handler\n"; }; eval { print "In eval\n"; die "Failed for some reason\n"; }; if ($@) { print "Caught exception: $@"; } The code unfortunately prints out:

  In eval
  handler

Which isn't quite what you would expect, especially if that C<$SIG{__DIE__}> handler is hidden away deep in some other module that you didn't know about. There are workarounds, however. One is to localize C<$SIG{__DIE__}> in every exception trap you write: eval { local $SIG{__DIE__}; ... }; Obviously this just doesn't scale - you don't want to be doing that for every exception trap in your code, and it's a slowdown. A second workaround is to check in your handler if you are trying to catch this exception: $SIG{__DIE__} = sub { die $_[0] if $^S; print "handler\n"; }; However this won't work under C<Apache::Registry> - you're always in an eval block there! The other problem with C<$SIG{__DIE__}> also relates to its global nature. Because you might have more than one application running under mod_perl, you can't be sure which has set a C<$SIG{__DIE__}> handler when and for what. This can become extremely confusing when you start scaling up from a set of simple registry scripts that might rely on CGI::Carp for global exception handling (which uses C<$SIG{__DIE__}> to trap exceptions) to having many applications installed with a variety of exception handling mechanisms in place. You should warn people about this danger of C<$SIG{__DIE__}> and inform them of better ways to code. The following material is an attempt to do just that.
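Before moving on, here are the interference and the local() workaround side by side, in a small self-contained sketch that can be run from the command line:

  #!/usr/bin/perl -w
  $SIG{__DIE__} = sub { print "global handler saw: $_[0]" };

  # the global handler fires even though we catch the exception
  eval { die "first\n" };
  print "caught: $@" if $@;

  # with the handler disabled locally, eval{} behaves as expected
  eval {
      local $SIG{__DIE__};
      die "second\n";
  };
  print "caught: $@" if $@;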
=head2 Better Exception Handling The C<eval{}> construct in itself is a fairly weak way to handle exceptions as strings. There's no way to pass more information in your exception, so you have to handle your exception in more than one place - at the location the error occurred, in order to construct a sensible error message, and again in your exception handler to de-construct that string into something meaningful (unless of course all you want your exception handler to do is dump the error to the browser). The other problem is that you have no way of automatically detecting where the exception occurred using the C<eval{}> construct. In a C<$SIG{__DIE__}> block you always have the use of the caller() function to detect where the error occurred. But we can fix that... A little known fact about exceptions in perl 5.005 is that you can call die with an object. The exception handler receives that object in C<$@>. This is how you are advised to handle exceptions now, as it provides an extremely flexible and scalable exceptions solution, potentially providing almost all of the power of Java exceptions. [As a footnote here, the only thing that is really missing from Java exceptions is a guaranteed "finally" clause, although it's possible to get about 98.62% of the way towards providing that using C<eval{}>.] =head3 A Little Housekeeping First though, before we delve into the details, a little housekeeping is in order. Most, if not all, mod_perl programs consist of a main routine that is entered and then dispatches to a routine depending on the parameters passed and/or the form values. In a normal C program this is your main() function, in a mod_perl handler this is your handler() function/method. The exception to this rule seems to be Apache::Registry scripts, although the techniques described here can be easily adapted. In order for you to be able to use exception handling to its best advantage you need to change your script to have some sort of global exception handling. This is far simpler than it sounds. If you're using C<Apache::Registry> to emulate CGI you might consider wrapping your entire script in one big eval block, but I would discourage that. A better method would be to modularize your script into discrete function calls, one of which should be a dispatch routine: #!/usr/bin/perl -w # Apache::Registry script eval { dispatch(); }; if ($@) { # handle exception } sub dispatch { ... } This is easier with an ordinary mod_perl handler as it is natural to have separate functions, rather than a long run-on script: MyHandler.pm ------------ sub handler { my $r = shift; eval { dispatch($r); }; if ($@) { # handle exception } } sub dispatch { my $r = shift; ... } Now that the skeleton code is set up, let's create an exception class, making use of Perl 5.005's ability to throw exception objects. =head3 An Exception Class This is a really simple exception class, that does nothing but contain information. A better implementation would probably also handle its own exception conditions, but that would be more complex, requiring separate packages for each exception type. My/Exception.pm --------------- package My::Exception; sub AUTOLOAD { no strict 'refs', 'subs'; if ($AUTOLOAD =~ /.*::([A-Z]\w+)$/) { my $exception = $1; *{$AUTOLOAD} = sub { shift; my ($package, $filename, $line) = caller; push @_, caller => { package => $package, filename => $filename, line => $line, }; bless { @_ }, "My::Exception::$exception"; }; goto &{$AUTOLOAD}; } else { die "No such exception class: $AUTOLOAD\n"; } } 1; OK, so this is all highly magical, but what does it do?
It creates a simple package that we can import and use as follows: use My::Exception; die My::Exception->SomeException( foo => "bar" ); The exception class tracks exactly where we died from using the caller() mechanism, it also caches exception classes so that C<AUTOLOAD> is only called the first time (in a given process) an exception of a particular type is thrown (particularly relevant under mod_perl). =head2 Catching Uncaught Exceptions What about exceptions that are thrown outside of your control? We can fix this using one of two possible methods. The first is to override die globally using the old magical C<$SIG{__DIE__}>, and the second is the cleaner, non-magical method of overriding the global die() function with your own die() that throws an exception that makes sense to your application. =head3 Using $SIG{__DIE__} Overloading using C<$SIG{__DIE__}> in this case is rather simple; here's some code: $SIG{__DIE__} = sub { my $err = ref($_[0]) ? $_[0] : My::Exception->UnCaught(text => join('', @_)); die $err; }; All this does is catch your exception and re-throw it. It's not as dangerous as the C<$SIG{__DIE__}> usage we warned about earlier, because we're actually re-throwing the exception, rather than catching it and stopping there. Even though $SIG{__DIE__} is a global handler, because we are simply re-throwing the exception we can let other applications outside of our control simply catch the exception and not worry about it. There's only one slight buggette left, and that's when some external code die()'s with a string, catches the exception itself, and tries to do string comparisons on it, as in: eval { ... # some code die "FATAL ERROR!\n"; }; if ($@) { if ($@ =~ /^FATAL ERROR/) { die $@; } } In order to deal with this, we can overload stringification for our C<My::Exception::UnCaught> class: { package My::Exception::UnCaught; use overload '""' => \&str; sub str { shift->{text}; } } We can now let other code happily continue. Note that there is a bug in Perl 5.6 which may affect people here: stringification does not occur when an object is operated on by a regular expression (via the =~ operator). A workaround is to explicitly stringify using qq double quotes; however, that doesn't help the poor soul who is using other applications. This bug has been fixed in later versions of Perl. =head3 Overriding the Core die() Function So what if we don't want to touch C<$SIG{__DIE__}> at all? We can overcome this by overriding the core die function. This is slightly more complex than implementing a C<$SIG{__DIE__}> handler, but is far less magical, and is the right thing to do, according to the L<perl5-porters mailing list|help/Get_help_with_Perl>. Overriding core functions has to be done from an external package/module. So we're going to add that to our C<My::Exception> module. Here are the relevant parts: use vars qw/@ISA @EXPORT/; use Exporter; @EXPORT = qw/die/; @ISA = 'Exporter'; sub die (@); # prototype to match CORE::die sub import { my $pkg = shift; $pkg->export('CORE::GLOBAL', 'die'); Exporter::import($pkg,@_); } sub die (@) { if (!ref($_[0])) { CORE::die My::Exception->UnCaught(text => join('', @_)); } CORE::die $_[0]; # only use the first element because it's an object } That wasn't so bad, was it? We're relying on Exporter's export() function to do the hard work for us, exporting the die() function into the C<CORE::GLOBAL> namespace. If we don't want to overload die() everywhere this can still be an extremely useful technique.
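Assuming the C<My::Exception> module above is somewhere on C<@INC>, the effect of the override can be demonstrated with a few lines (the legacy subroutine here is of course invented):

  use My::Exception;    # installs die() into CORE::GLOBAL

  eval { some_legacy_code() };
  if (ref $@ and $@->isa('My::Exception::UnCaught')) {
      # even a plain die("string") arrives here as an object
      print "uncaught: ", $@->{text};
  }

  sub some_legacy_code { die "legacy error string\n" }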
By just using Exporter's default import() method we can export our new die() method into any package of our choosing. This allows us to short-cut the long calling convention and simply die() with a string, and let the system handle the actual construction into an object for us. Along with the above overloaded stringification, we now have a complete exception system (well, mostly complete. Exception die-hards would argue that there's no "finally" clause, and no exception stack, but that's another topic for another time). =head2 A Single UnCaught Exception Class Until the Perl core gets its own base exception class (which will likely happen for Perl 6, but not sooner), it is vitally important that you decide upon a single base exception class for all of the applications that you install on your server, and a single exception handling technique. The problem comes when you have multiple applications all doing exception handling and all expecting a certain type of "UnCaught" exception class. Witness the following application: package Foo; eval { # do something }; if ($@) { if ($@->isa('Foo::Exception::Bar')) { # handle "Bar" exception } elsif ($@->isa('Foo::Exception::UnCaught')) { # handle uncaught exceptions } } All will work well until someone installs application "TrapMe" on the same machine, which installs its own UnCaught exception handler, overloading CORE::GLOBAL::die or installing a $SIG{__DIE__} handler. This is actually a case where using $SIG{__DIE__} might be preferable, because you can change your handler() routine to look like this: sub handler { my $r = shift; local $SIG{__DIE__}; Foo::Exception->Init(); # sets $SIG{__DIE__} eval { dispatch($r); }; if ($@) { # handle exception } } sub dispatch { my $r = shift; ... } In this case the very nature of $SIG{__DIE__} being a global variable that can be local()ized has helped us, something we couldn't fix with overloading CORE::GLOBAL::die. However there is still a gotcha. If someone has overloaded die() in one of the applications installed on your mod_perl machine, you still get the same problems. So in short: watch out, and check the source code of anything you install to make sure it follows your exception handling technique, or just uses die() with strings. =head2 Some Uses I'm going to come right out and say now: I abuse this system horribly! I throw exceptions all over my code, not because I've hit an "exceptional" bit of code, but because I want to get straight back out of the current call stack, without having to have every single level of function call check error codes. One way I use this is to return Apache return codes: # paranoid security check die My::Exception->RetCode(code => 204); Returns a 204 status code (C<HTTP_NO_CONTENT>), which is caught at my top level exception handler: if ($@->isa('My::Exception::RetCode')) { return $@->{code}; } That last return statement is in my handler() method, so that's the return code that Apache actually sends. I have other exception handlers in place for sending Basic Authentication headers and Redirect headers out. I also have a generic C<My::Exception::OK> class, which gives me a way to back out completely from where I am, but register that as an OK thing to do. Why do I go to these lengths? After all, code like slashcode (the code behind http://slashdot.org) doesn't need this sort of thing, so why should my web site? Well it's just a matter of scalability and programmer style really.
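Putting the pieces together, such a top-level handler() might look like the sketch below, where C<dispatch()> and the exception classes are the ones discussed above, and the constants come from the standard C<Apache::Constants> module:

  package My::App;
  use Apache::Constants qw(OK SERVER_ERROR);

  sub handler {
      my $r = shift;
      eval { dispatch($r) };
      if ($@ and ref $@) {
          # a special exception carrying the Apache status to return
          return $@->{code} if $@->isa('My::Exception::RetCode');
          # "get me out of here, but everything is fine"
          return OK if $@->isa('My::Exception::OK');
          # anything else gets logged and reported as a server error
          $r->log_error("uncaught exception: $@");
          return SERVER_ERROR;
      }
      return OK;
  }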
There's a lot of literature out there about exception handling, so I suggest doing some research. =head2 Conclusions Here I've demonstrated a simple and scalable (and useful) exception handling mechanism, that fits perfectly with your current code, and provides the programmer with an excellent means to determine what has happened in his code. Some users might be worried about the overhead of such code. However in use I've found accessing the database to be a much more significant overhead, and this is used in some code delivering to thousands of users. For similar exception handling techniques, see the section "L<Other Implementations|perl/Other_Implementations>". =head2 The My::Exception class in its entirety package My::Exception; use vars qw/@ISA @EXPORT $AUTOLOAD/; use Exporter; @ISA = 'Exporter'; @EXPORT = qw/die/; sub die (@); sub import { my $pkg = shift; # allow "use My::Exception 'die';" to mean import locally only $pkg->export('CORE::GLOBAL', 'die') unless @_; Exporter::import($pkg,@_); } sub die (@) { if (!ref($_[0])) { CORE::die My::Exception->UnCaught(text => join('', @_)); } CORE::die $_[0]; } { package My::Exception::UnCaught; use overload '""' => sub { shift->{text} } ; } sub AUTOLOAD { no strict 'refs', 'subs'; if ($AUTOLOAD =~ /.*::([A-Z]\w+)$/) { my $exception = $1; *{$AUTOLOAD} = sub { shift; my ($package, $filename, $line) = caller; push @_, caller => { package => $package, filename => $filename, line => $line, }; bless { @_ }, "My::Exception::$exception"; }; goto &{$AUTOLOAD}; } else { die "No such exception class: $AUTOLOAD\n"; } } 1; =head2 Other Implementations Some users might find it very useful to have the more C++/Java-like interface of try/catch functions. These are available in several forms that all work in slightly different ways. See the documentation for each module for details: =over =item * Error.pm Graham Barr's excellent OO styled "try, throw, catch" module (from L<CPAN|download/Perl>). This should be considered your best option for structured exception handling because it is well known and well supported and used by a lot of other applications. =item * Exception::Class and Devel::StackTrace by Dave Rolsky, both available from CPAN of course. C<Exception::Class> is a bit cleaner than the C<AUTOLOAD> method from above as it can catch typos in exception class names, whereas the method above will automatically create a new class for you. In addition, it lets you create actual class hierarchies for your exceptions, which can be useful if you want to create exception classes that provide extra methods or data. For example, an exception class for database errors could provide a method for returning the SQL and bound parameters in use at the time of the error. =item * Try.pm Tony Olekshy's module. It adds an unwind stack and some other interesting features. Not on the CPAN. Available at http://www.avrasoft.com/perl/rfc/try-1136.zip =back =cut 1.1 modperl-docs/src/docs/1.0/guide/porting.pod Index: porting.pod =================================================================== =head1 NAME CGI to mod_perl Porting. mod_perl Coding guidelines. =head1 Document Coverage This chapter is relevant both to writing a new CGI script or Perl handler from scratch and to migrating an application from plain CGI to mod_perl. It also addresses the situation where the CGI script being ported does the job, but is too dirty to be altered easily to run as a mod_perl program (the C<Apache::PerlRun> mode).
If you are at the porting stage, you can use this chapter as a reference for possible problems you might encounter when running an existing CGI script in the new mode. If your project schedule is tight, I would suggest converting to mod_perl in the following steps: initially, run all the scripts in the C<Apache::PerlRun> mode. Then as time allows, move them into C<Apache::Registry> mode. Later if you need Apache Perl API functionality you can always add it. If you are about to write a new CGI script from scratch, it would be a good idea to learn about possible mod_perl related pitfalls and to avoid them in the first place. If you don't need mod_cgi compatibility, it's a good idea to start writing using the mod_perl API in the first place. This will make your application a little bit more efficient and it will be easier to use the full mod_perl feature set, which extends the core Perl functionality with Apache specific functions and overridden Perl core functions that were reimplemented to work better in the mod_perl environment. =head1 Before you start to code It can be a good idea to tighten up some of your Perl programming practices, since mod_perl doesn't tolerate sloppy programming. This chapter relies on a certain level of Perl knowledge. Please read through the L<Perl Reference|perl/Perl_Reference_> chapter and make sure you know the material covered there. This will allow me to concentrate on pure mod_perl issues and make them more prominent to the experienced Perl programmer; otherwise they would be lost in the sea of Perl background notes. Additional resources: =over =item * Perl Module Mechanics This page describes the mechanics of creating, compiling, releasing, and maintaining Perl modules. http://world.std.com/~swmcd/steven/perl/module_mechanics.html The information is very relevant to a mod_perl developer. =item * The Eagle Book "Writing Apache Modules with Perl and C" is a "must have" book! See the details at http://www.modperl.com . =item * "Programming Perl" Book =item * "Perl Cookbook" Book =item * "Object Oriented Perl" Book =back =head1 Exposing Apache::Registry secrets Let's start with some simple code and see what can go wrong with it, detect bugs and debug them, discuss possible pitfalls and how to avoid them. I will use a simple CGI script that initializes a C<$counter> to 0, and prints its value to the browser while incrementing it. counter.pl: ---------- #!/usr/bin/perl -w use strict; print "Content-type: text/plain\r\n\r\n"; my $counter = 0; for (1..5) { increment_counter(); } sub increment_counter{ $counter++; print "Counter is equal to $counter !\r\n"; } You would expect to see the output: Counter is equal to 1 ! Counter is equal to 2 ! Counter is equal to 3 ! Counter is equal to 4 ! Counter is equal to 5 ! And that's what you see when you execute this script the first time. But let's reload it a few times... See, suddenly after a few reloads the counter doesn't start its count from 1 any more. We continue to reload and see that it keeps on growing, but not steadily, starting almost randomly at 10, 10, 10, 15, 20... Weird... Counter is equal to 6 ! Counter is equal to 7 ! Counter is equal to 8 ! Counter is equal to 9 ! Counter is equal to 10 ! We saw two anomalies in this very simple script: an unexpected increment of our counter beyond 5, and inconsistent growth over reloads. Let's investigate this script. =head2 The First Mystery First let's peek into the C<error_log> file.
Since we have enabled warnings, what we see is: Variable "$counter" will not stay shared at /home/httpd/perl/conference/counter.pl line 13. The I<Variable "$counter" will not stay shared> warning is generated when the script contains a named nested subroutine (a named - as opposed to anonymous - subroutine defined inside another subroutine) that refers to a lexically scoped variable defined outside this nested subroutine. This effect is explained in L<my() Scoped Variable in Nested Subroutines|perl/my_Scoped_Variable_in_Nested_S>. Do you see a nested named subroutine in my script? I don't! What's going on? Maybe it's a bug? But wait, maybe the perl interpreter sees the script in a different way, maybe the code goes through some changes before it actually gets executed? The easiest way to check what's actually happening is to run the script with a debugger. But since we must debug it when it's being executed by the webserver, a normal debugger won't help, because the debugger has to be invoked from within the webserver. Luckily Doug MacEachern wrote the C<Apache::DB> module and we will use this to debug my script. While C<Apache::DB> allows you to debug the code interactively, we will do it non-interactively. Modify the C<httpd.conf> file in the following way: PerlSetEnv PERLDB_OPTS "NonStop=1 LineInfo=/tmp/db.out AutoTrace=1 frame=2" PerlModule Apache::DB <Location /perl> PerlFixupHandler Apache::DB SetHandler perl-script PerlHandler Apache::Registry Options ExecCGI PerlSendHeader On </Location> Restart the server and issue a request to I<counter.pl> as before. On the surface nothing has changed--we still see the correct output as before, but two things happened in the background: Firstly, the file I</tmp/db.out> was written, with a complete trace of the code that was executed. Secondly, if you have loaded the C<Carp> module already, I<error_log> now contains the real code that was actually executed. This is produced as a side effect of reporting the I<Variable "$counter" will not stay shared at...> warning that we saw earlier. To load the Carp module, you can add: use Carp; in your I<startup.pl> file or in the executed code. Here is the code that was actually executed: package Apache::ROOT::perl::conference::counter_2epl; use Apache qw(exit); sub handler { BEGIN { $^W = 1; }; $^W = 1; use strict; print "Content-type: text/plain\r\n\r\n"; my $counter = 0; for (1..5) { increment_counter(); } sub increment_counter{ $counter++; print "Counter is equal to $counter !\r\n"; } } The code in the I<error_log> wasn't indented. I've indented it for you to stress that the code was wrapped inside the handler() subroutine. What do we learn from this? Well firstly that every CGI script is cached under a package whose name is formed from the C<Apache::ROOT::> prefix and the relative part of the script's URL (C<perl::conference::counter_2epl>) by replacing all occurrences of C</> with C<::> and C<.> with C<_2e>. That's how mod_perl knows what script should be fetched from the cache--each script is just a package with a single subroutine named C<handler>. If we were to add C<use diagnostics> to the script we would also see a reference in the error text to an inner (nested) subroutine--C<increment_counter> is actually a nested subroutine. With mod_perl, each subroutine in every C<Apache::Registry> script is nested inside the C<handler> subroutine.
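You don't even need a webserver to reproduce the effect: wrapping the same code in a function, the way C<Apache::Registry> does, exhibits the stale closure from plain command-line Perl. In this small sketch each call of handler() stands for one request served by the same child:

  #!/usr/bin/perl -w
  # simulate two consecutive requests served by the same process
  handler() for 1..2;

  sub handler {
      my $counter = 0;        # a fresh lexical on every "request"
      for (1..3) { increment_counter() }

      # increment_counter() closed over the first $counter only, so
      # the second handler() call keeps counting: 4, 5, 6
      sub increment_counter {
          $counter++;
          print "Counter is equal to $counter !\n";
      }
  }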
It's important to understand that the I<inner subroutine> effect happens only with code that C<Apache::Registry> wraps with a declaration of the C<handler> subroutine. If you put your code into a library or module, which the main script require()'s or use()'s, this effect doesn't occur. For example if we move the code from the script into the subroutine I<run>, place the subroutines into the I<mylib.pl> file, save it in the same directory as the script itself and require() it, there will be no problem at all. (Don't forget the C<1;> at the end of the library or the require() might fail.) mylib.pl: --------- my $counter; sub run{ print "Content-type: text/plain\r\n\r\n"; $counter = 0; for (1..5) { increment_counter(); } } sub increment_counter{ $counter++; print "Counter is equal to $counter !\r\n"; } 1; counter.pl: ---------- use strict; require "./mylib.pl"; run(); This solution provides the easiest and the fastest way to solve the nested subroutines problem, since all you have to do is move the code into a separate file: first wrap the initial code into some function that you will later call from the script, and keep the lexically scoped variables that could cause the problem out of this function. But as a general rule of thumb, unless the script is very short, I tend to write all the code in external libraries, and to have only a few lines in the main script. Generally the main script simply calls the main function of my library. Usually I call it C<init()> or C<run()>. I don't worry about nested subroutine effects anymore (unless I create them myself :). The section 'L<Remedies for Inner Subroutines|perl/Remedies_for_Inner_Subroutines>' discusses many other possible workarounds for this problem. You shouldn't be intimidated by this issue at all, since Perl is your friend. Just keep the warnings mode B<On> and Perl will gladly tell you whenever you have this effect, by saying: Variable "$counter" will not stay shared at ...[snipped] Just don't forget to check your I<error_log> file, before going into production! By the way, the above example was pretty boring. In my first days of using mod_perl, I wrote a simple user registration program. I'll give a very simple representation of this program. use CGI; $q = CGI->new; my $name = $q->param('name'); print_response(); sub print_response{ print "Content-type: text/plain\r\n\r\n"; print "Thank you, $name!"; } My boss and I checked the program on the development server and it worked OK. So we decided to put it in production. Everything was OK, but my boss decided to keep on checking by submitting variations of his profile. Imagine his surprise when, after submitting his name (let's say "The Boss" :), he saw the response "Thank you, Stas Bekman!". What happened is that I had tried the production system as well, before he did. I was new to mod_perl stuff, and was so excited by the speed improvement that I didn't notice the nested subroutine problem. It hit me. At first I thought that maybe Apache had started to confuse connections, returning responses from other people's requests. I was wrong of course. Why didn't we notice this when we were trying the software on our development server? Keep reading and you will understand why.
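For the record, the minimal fix for this registration script is to stop relying on the closed-over lexical and to pass the value to the subroutine explicitly:

  use CGI;
  my $q = CGI->new;
  my $name = $q->param('name');
  print_response($name);

  sub print_response {
      my $name = shift;   # a private copy, not a stale closure
      print "Content-type: text/plain\r\n\r\n";
      print "Thank you, $name!";
  }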
=head2 The Second Mystery Let's return to our original example and proceed with the second mystery we noticed. Why did we see inconsistent results over numerous reloads? That's very simple. Every time a server gets a request to process, it hands it over to one of the children, generally in a round robin fashion. So if you have 10 httpd children alive, the first 10 reloads might seem to be correct because the effect we've just talked about starts to appear from the second re-invocation. Subsequent reloads then return unexpected results. Moreover, requests can appear at random and children don't always run the same scripts. At any given moment one of the children could have served the same script more times than any other, and another may never have run it. That's why we saw the strange behavior. Now you see why we didn't notice the problem with the user registration system in the example. First, we didn't look at the C<error_log>. (As a matter of fact we did, but there were so many warnings in there that we couldn't tell which were the important ones and which were not). Second, we had too many server children running to notice the problem. A workaround is to run the server as a single process. You achieve this by invoking the server with the C<-X> parameter (C<httpd -X>). Since there are no other servers (children) running, you will see the problem on the second reload. But before that, let the C<error_log> help you detect most of the possible errors--most of the warnings can become errors, so you should make sure to check every warning that is detected by perl, and probably you should write your code in such a way that no warnings appear in the C<error_log>. If your C<error_log> file is filled up with hundreds of lines on every script invocation, you will have difficulty noticing and locating real problems--and on a production server you'll soon run out of disk space if your site is popular. Of course none of the warnings will be reported if the warning mechanism is not turned B<On>. Refer to the section "L<Tracing Warnings Reports|perl/Tracing_Warnings_Reports>" to learn about warnings in general and to the "L<Warnings|porting/Warnings>" section to learn how to turn them on and off under mod_perl. =head1 Sometimes it Works, Sometimes it Doesn't When you start running your scripts under mod_perl, you might find yourself in a situation where a script seems to work, but sometimes it screws up. And the more it runs without a restart, the more it screws up. Often the problem is easily detectable and solvable. You have to test your script under a server running in single process mode (C<httpd -X>). Generally the problem is the result of using global variables. Because global variables don't change from one script invocation to another unless you change them, you can find your scripts doing strange things. Let's look at three real world examples: =head2 An Easy Break-in The first example is amazing--Web Services. Imagine that you enter some site where you have an account, perhaps a free email account. Having read your own mail you decide to take a look at someone else's. You type in the username you want to peek at and a dummy password and try to enter the account. On some services this will work!!! You say, why in the world does this happen? The answer is simple: B<Global Variables>. You have entered the account of someone who happened to be served by the same server child as you. Because of sloppy programming, a global variable was not reset at the beginning of the program and voila, you can easily peek into someone else's email!
Here is an example of sloppy code: use vars qw($authenticated); my $q = new CGI; my $username = $q->param('username'); my $passwd = $q->param('passwd'); authenticate($username,$passwd); # failed, break out unless ($authenticated){ print "Wrong passwd"; exit; } # user is OK, fetch user's data show_user($username); sub authenticate{ my ($username,$passwd) = @_; # some checking $authenticated = 1 if SOME_USER_PASSWD_CHECK_IS_OK; } Do you see the catch? With the code above, I can type in any valid username and any dummy password and enter that user's account, provided she has successfully entered her account before me using the same child process! Since C<$authenticated> is global--if it becomes 1 once, it'll stay 1 for the remainder of the child's life!!! The solution is trivial--reset C<$authenticated> to 0 at the beginning of the program. A cleaner solution of course is not to rely on global variables, but to rely on the return value from the function. my $q = CGI->new; my $username = $q->param('username'); my $passwd = $q->param('passwd'); my $authenticated = authenticate($username,$passwd); # failed, break out unless ($authenticated){ print "Wrong passwd"; exit; } # user is OK, fetch user's data show_user($username); sub authenticate{ my ($username,$passwd) = @_; # some checking return (SOME_USER_PASSWD_CHECK_IS_OK) ? 1 : 0; } Of course this example is trivial--but believe me it happens! =head2 Thinking mod_cgi Just another little one-liner that can spoil your day, assuming you forgot to reset the C<$allowed> variable. It works perfectly OK in plain mod_cgi: $allowed = 1 if $username eq 'admin'; But using mod_perl, if your system administrator with superuser access rights has previously used the system, anybody who is lucky enough to be served later by the same child which served your administrator will gain the same rights. The obvious fix is: $allowed = $username eq 'admin' ? 1 : 0; =head2 Regular Expression Memory Another good example is usage of the B</o> regular expression modifier, which compiles a regular expression once, on its first execution, and never compiles it again. This problem can be difficult to detect, as after restarting the server each request you make will be served by a different child process, and thus the regex pattern for that child will be compiled afresh. Only when you make a request that happens to be served by a child which has already cached the regex will you see the problem. Generally you miss that. When you press reload, you see that it works (with a new, fresh child). Eventually it doesn't, because you get a child that has already cached the regex and won't recompile because of the B</o> modifier. An example of such a case would be: my $pat = $q->param("keyword"); foreach( @list ) { print if /$pat/o; } To make sure you don't miss these bugs always test your CGI in L<single process mode|control/Running_a_Server_in_Single_Process_Mode>. To solve this particular B</o> modifier problem refer to L<Compiled Regular Expressions|perl/Compiled_Regular_Expressions>. =head1 Script's name space Scripts under C<Apache::Registry> do not run in package C<main>, they run in a unique name space based on the requested URI. For example, if your URI is C</perl/test.pl> the package will be called C<Apache::ROOT::perl::test_2epl>. =head1 @INC and mod_perl The basic Perl C<@INC> behaviour is explained in section L<use(), require(), do(), %INC and @INC Explained|perl/use_require_do_INC_and>.
When running under mod_perl, once the server is up C<@INC> is frozen and cannot be updated. The only opportunity to B<temporarily> modify C<@INC> is while the script or the module are loaded and compiled for the first time. After that its value is reset to the original one. The only way to change C<@INC> permanently is to modify it at Apache startup. Two ways to alter C<@INC> at server startup: =over =item * In the configuration file. For example add: PerlSetEnv PERL5LIB /home/httpd/perl or PerlSetEnv PERL5LIB /home/httpd/perl:/home/httpd/mymodules Note that this setting will be ignored if you have the C<PerlTaintCheck> mode turned on. =item * In the startup file directly alter the C<@INC>. For example startup.pl ---------- use lib qw(/home/httpd/perl /home/httpd/mymodules); and load the startup file from the configuration file by: PerlRequire /path/to/startup.pl =back =head1 Reloading Modules and Required Files You might want to read the "L<use(), require(), do(), %INC and @INC Explained|perl/use_require_do_INC_and>" before you proceed with this section. When you develop plain CGI scripts, you can just change the code, and rerun the CGI from your browser. Since the script isn't cached in memory, the next time you call it the server starts up a new perl process, which recompiles it from scratch. The effects of any modifications you've applied are immediately present. The situation is different with C<Apache::Registry>, since the whole idea is to get maximum performance from the server. By default, the server won't spend time checking whether any included library modules have been changed. It assumes that they weren't, thus saving a few milliseconds to stat() the source file (multiplied by however many modules/libraries you use() and/or require() in your script.) The only check that is done is to see whether your main script has been changed. So if you have only scripts which do not use() or require() other perl modules or packages, there is nothing to worry about. If, however, you are developing a script that includes other modules, the files you use() or require() aren't checked for modification and you need to do something about that. So how do we get our mod_perl-enabled server to recognize changes in library modules? Well, there are a couple of techniques: =head2 Restarting the server The simplest approach is to restart the server each time you apply some change to your code. See L<Server Restarting techniques|control/Restarting_Techniques>. After restarting the server about 100 times, you will tire of it and you will look for other solutions. =head2 Using Apache::StatINC for the Development Process Help comes from the C<Apache::StatINC> module. When Perl pulls a file via require(), it stores the full pathname as a value in the global hash C<%INC> with the file name as the key. C<Apache::StatINC> looks through C<%INC> and immediately reloads any files that have been updated on disk. To enable this module just add two lines to C<httpd.conf>. PerlModule Apache::StatINC PerlInitHandler Apache::StatINC To be sure it really works, turn on debug mode on your development box by adding C<PerlSetVar StatINCDebug On> to your config file. 
You end up with something like this: PerlModule Apache::StatINC <Location /perl> SetHandler perl-script PerlHandler Apache::Registry Options ExecCGI PerlSendHeader On PerlInitHandler Apache::StatINC PerlSetVar StatINCDebug On </Location> Be aware that only the modules located in C<@INC> are reloaded on change, and you can change C<@INC> only before the server has been started (in the startup file). Nothing you do in your scripts and modules which are pulled in with require() after server startup will have any effect on C<@INC>. When you write: use lib qw(foo/bar); C<@INC> is changed only for the time the code is being parsed and compiled. When that's done, C<@INC> is reset to its original value. To make sure that you have set C<@INC> correctly, configure L</perl-status location|debug/Apache__Status____Embedded_Interpreter_Status_Information>, fetch http://www.example.com/perl-status?inc and look at the bottom of the page, where the contents of C<@INC> will be shown. Notice the following trap: While "C<.>" is in C<@INC>, perl knows to require() files with pathnames given relative to the current (script) directory. After the script has been parsed, the server doesn't remember the path! So you can end up with a broken entry in C<%INC> like this: $INC{bar.pl} eq "bar.pl" If you want Apache::StatINC to reload your script--modify C<@INC> at server startup, or use a full path in the require() call. =head2 Using Apache::Reload C<Apache::Reload> comes as a drop-in replacement for C<Apache::StatINC>. It provides extra functionality and better flexibility. If you want C<Apache::Reload> to check all the loaded modules on each request, you just add to I<httpd.conf>: PerlInitHandler Apache::Reload If you want to reload only specific modules when these get changed, you have two ways to do that. =head3 Register Modules Implicitly The first way is to turn I<Off> the C<ReloadAll> variable, which is I<On> by default: PerlInitHandler Apache::Reload PerlSetVar ReloadAll Off and add: use Apache::Reload; to every module that you want to be reloaded on change. =head3 Register Modules Explicitly The second way is to explicitly specify modules to be reloaded in I<httpd.conf>: PerlInitHandler Apache::Reload PerlSetVar ReloadModules "My::Foo My::Bar Foo::Bar::Test" Note that these are split on whitespace, but the module list B<must> be in quotes, otherwise Apache tries to parse the parameter list. You can register groups of modules using the metacharacter (C<*>): PerlSetVar ReloadModules "Foo::* Bar::*" In the above example all modules starting with I<Foo::> and I<Bar::> will be registered. This feature allows you to register a whole project's module tree with one pattern. =head3 Special "Touch" File You can also set a file that you can touch(1) to cause the reloads to be performed. If you set this, and don't touch(1) the file, the reloads don't happen (no matter how you have registered the modules to be reloaded). PerlSetVar ReloadTouchFile /tmp/reload_modules Now when you're happy with your changes, simply go to the command line and type: % touch /tmp/reload_modules This feature is very convenient in a production server environment, but compared to a full restart, the benefits of preloaded modules memory sharing are lost, since each child will get its own copy of the reloaded modules. =head3 Caveats This module might have a problem with reloading single modules that contain multiple packages that all use pseudo-hashes.
Also if you have modules loaded from directories which are not in C<@INC>, C<Apache::Reload> will fail to find the files, due to the fact that C<@INC> is reset to its original value even if it gets temporarily modified in the script. The solution is to extend C<@INC> at server startup to include the directories you load files from that aren't already in C<@INC>. For example, if you have a script which loads I<MyTest.pm> from I</home/stas/myproject>: use lib qw(/home/stas/myproject); require MyTest; C<Apache::Reload> won't find this file, unless you alter C<@INC> in I<startup.pl> (or I<httpd.conf>): startup.pl ---------- use lib qw(/home/stas/myproject); and restart the server. Now the problem is solved. =head3 Availability This module is available from L<CPAN|download/CPAN_Downloads>. =head2 Configuration Files: Writing, Dynamically Updating and Reloading Checking all the modules in C<%INC> on every request can add a large overhead to server response times, and you certainly would not want the C<Apache::StatINC> module to be enabled in your production site's configuration. But sometimes you want a configuration file reloaded when it is updated, without restarting the server. This is an especially important feature if, for example, you have a person who is allowed to modify some of the tool configuration, but for security reasons it's undesirable for him to telnet to the server to restart it. =head3 Writing Configuration Files Since we are talking about configuration files, I would like to show you some good and bad approaches to configuration file writing. If you have a configuration file of just a few variables, it doesn't really matter how you do it. But generally this is not the case. Configuration files tend to grow as a project grows. It's very relevant to projects that generate HTML files, since they tend to demand many easily configurable parameters, like headers, footers, colors and so on. So let's start with the approach that is most often taken by CGI script writers. All configuration variables are defined in a separate file. For example: $cgi_dir = "/home/httpd/perl"; $cgi_url = "/perl"; $docs_dir = "/home/httpd/docs"; $docs_url = "/"; $img_dir = "/home/httpd/docs/images"; $img_url = "/images"; ... many more config params here ... $color_hint = "#777777"; $color_warn = "#990066"; $color_normal = "#000000"; The C<use strict;> pragma demands that all the variables be declared. When we want to use these variables in a mod_perl script we must declare them with C<use vars> in the script. (Under Perl v5.6.0 C<our()> has replaced C<use vars>.) So we start the script with: use strict; use vars qw($cgi_dir $cgi_url $docs_dir $docs_url ... many more config params here .... $color_hint $color_warn $color_normal ); It is a nightmare to maintain such a script, especially if not all the features have been coded yet. You have to keep adding and removing variable names. But that's not a big deal. Since we want our code clean, we start the configuration file with C<use strict;> as well, so we have to list the variables with the C<use vars> pragma here as well. A second list of variables to maintain. If you have many scripts, you may get collisions between configuration files. One of the best solutions is to declare packages, with unique names of course. For example for our configuration file we might declare the following package name: package My::Config; The moment you add a package declaration and think that you are done, you realize that the nightmare has just begun.
When you have declared the package, you cannot just require() the file and use the variables, since they now belong to a different package. So you have either to modify all your scripts to use a fully qualified notation like C<$My::Config::cgi_url> instead of just C<$cgi_url> or to import the needed variables into any script that is going to use them. Since you don't want to do the extra typing to make the variables fully qualified, you'd go for the importing approach. But your configuration package has to export them first. That means that you have to list all the variables again, and now you have to keep at least three variable lists updated when you make some changes in the naming of the configuration variables. And that's when you have only one script that uses the configuration file; in the general case you have many of them. So now our example configuration file looks like this: package My::Config; use strict; BEGIN { use Exporter (); @My::Config::ISA = qw(Exporter); @My::Config::EXPORT = qw(); @My::Config::EXPORT_OK = qw($cgi_dir $cgi_url $docs_dir $docs_url ... many more config params here .... $color_hint $color_warn $color_normal); } use vars qw($cgi_dir $cgi_url $docs_dir $docs_url ... many more config params here .... $color_hint $color_warn $color_normal ); $cgi_dir = "/home/httpd/perl"; $cgi_url = "/perl"; $docs_dir = "/home/httpd/docs"; $docs_url = "/"; $img_dir = "/home/httpd/docs/images"; $img_url = "/images"; ... many more config params here ... $color_hint = "#777777"; $color_warn = "#990066"; $color_normal = "#000000"; And in the code: use strict; use My::Config qw($cgi_dir $cgi_url $docs_dir $docs_url ... many more config params here .... $color_hint $color_warn $color_normal ); use vars qw($cgi_dir $cgi_url $docs_dir $docs_url ... many more config params here .... $color_hint $color_warn $color_normal ); This approach is especially bad in the context of mod_perl, since exported variables add a memory overhead. The more variables exported the more memory you use. If we multiply this overhead by the number of servers we are going to run, we get a pretty big number which could be used to run a few more servers instead. As a matter of fact things aren't so bad. You can group your variables, and call the groups by special names called tags, which can later be used as arguments to the import() or use() calls. You are probably familiar with: use CGI qw(:standard :html); We can implement this quite easily, with the help of export_ok_tags() from C<Exporter>. For example: BEGIN { use Exporter (); use vars qw( @ISA @EXPORT @EXPORT_OK %EXPORT_TAGS); @ISA = qw(Exporter); @EXPORT = qw(); @EXPORT_OK = qw(); %EXPORT_TAGS = ( vars => [qw($fname $lname)], subs => [qw(reread_conf untaint_path)], ); Exporter::export_ok_tags('vars'); Exporter::export_ok_tags('subs'); } You export subroutines exactly like variables, since what's actually being exported is a symbol. The definition of these subroutines is not shown here. Notice that we didn't use export_tags(), as it exports the variables automatically without the user asking for them in the first place, which is considered bad style. If a module automatically exports variables with export_tags() you can stop this by not exporting at all: use My::Config (); In your code you can now write: use My::Config qw(:subs :vars); Groups of group tags: The C<:all> tag from C<CGI.pm> is a group tag of all other groups. It will require a little more effort to implement, but you can always save time by looking at the solution in C<CGI.pm>'s code. It's just a matter of a little code to expand all the groups recursively, as the sketch below shows.
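Here is one way it could be done (a sketch using the same C<%EXPORT_TAGS> layout as above, not C<CGI.pm>'s actual code): any member that is itself a tag gets replaced by the members of that tag, until only plain symbol names remain.

  %EXPORT_TAGS = (
      vars => [qw($fname $lname)],
      subs => [qw(reread_conf untaint_path)],
      all  => [qw(:vars :subs)],    # a group of groups
  );

  # turn a list that may contain ":tag" entries into plain symbols
  # (assumes the tags don't refer to each other in a cycle)
  sub expand_tags {
      my @out;
      for (@_) {
          if (/^:(.+)/ and $EXPORT_TAGS{$1}) {
              push @out, expand_tags(@{ $EXPORT_TAGS{$1} });
          }
          else {
              push @out, $_;
          }
      }
      return @out;
  }

  $EXPORT_TAGS{all} = [ expand_tags(':all') ];   # flatten it
  Exporter::export_ok_tags('vars', 'subs', 'all');

After this, C<use My::Config qw(:all);> imports every group in one go.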
After going through the pain of maintaining a list of variables in a big project with a huge configuration file (more than 100 variables) and many files actually using them, I came up with a much simpler solution: keeping all the variables in a single hash, which is built from references to other anonymous scalars, arrays and hashes. Now my configuration file looks like this: package My::Config; use strict; BEGIN { use Exporter (); @My::Config::ISA = qw(Exporter); @My::Config::EXPORT = qw(); @My::Config::EXPORT_OK = qw(%c); } use vars qw(%c); %c = ( dir => { cgi => "/home/httpd/perl", docs => "/home/httpd/docs", img => "/home/httpd/docs/images", }, url => { cgi => "/perl", docs => "/", img => "/images", }, color => { hint => "#777777", warn => "#990066", normal => "#000000", }, ); Good perl style suggests keeping a comma at the end of lists. That's because additional items tend to be added to the end of the list. If you keep that last comma in place, you don't have to remember to add one when you add a new item. So now the script looks like this: use strict; use My::Config qw(%c); use vars qw(%c); print "Content-type: text/plain\r\n\r\n"; print "My url docs root: $c{url}{docs}\n"; Do you see the difference? The whole mess has gone, there is only one variable to worry about. There is one small downside to taking this approach: auto-vivification. For example, if we wrote C<$c{url}{doc}> by mistake, perl would silently create this element for us with the value I<undef>. When we C<use strict;> Perl will tell us about any misspelling of this kind for a simple scalar, but this check is not performed for hash elements. This puts the onus of responsibility back on us since we must take greater care. A possible solution to this is to use pseudo-hashes, but they are still considered experimental so we won't cover them here. The benefits of the hash approach are significant, and we can do even better. I would like to get rid of the C<Exporter> stuff completely. I remove all the exporting code so my config file now looks like: package My::Config; use strict; use vars qw(%c); %c = ( dir => { cgi => "/home/httpd/perl", docs => "/home/httpd/docs", img => "/home/httpd/docs/images", }, url => { cgi => "/perl", docs => "/", img => "/images", }, color => { hint => "#777777", warn => "#990066", normal => "#000000", }, ); And the code: use strict; use My::Config (); print "Content-type: text/plain\r\n\r\n"; print "My url docs root: $My::Config::c{url}{docs}\n"; Since we still want to save lots of typing, and since now we need to use a fully qualified notation like C<$My::Config::c{url}{docs}>, let's use the magical Perl aliasing feature. I'll modify the code to be: use strict; use My::Config (); use vars qw(%c); *c = \%My::Config::c; print "Content-type: text/plain\r\n\r\n"; print "My url docs root: $c{url}{docs}\n"; I have aliased the C<*c> glob with C<\%My::Config::c>, a reference to a hash. From now on, C<%My::Config::c> and C<%c> are the same hash and you can read from or modify either of them. Just one last little point. Sometimes you see a lot of redundancy in the configuration variables, for example: $cgi_dir = "/home/httpd/perl"; $docs_dir = "/home/httpd/docs"; $img_dir = "/home/httpd/docs/images"; Now if you want to move the base path C<"/home/httpd"> into a new place, it demands lots of typing.
Of course the solution is: $base = "/home/httpd"; $cgi_dir = "$base/perl"; $docs_dir = "$base/docs"; $img_dir = "$docs_dir/images"; You cannot do the same trick with a hash, since you cannot refer to its values before the definition is finished. So this wouldn't work: %c = ( base => "/home/httpd", dir => { cgi => "$c{base}/perl", docs => "$c{base}/docs", img => "$c{base}{docs}/images", }, ); But nothing stops us from adding additional variables, which are lexically scoped with my(). The following code is correct. my $base = "/home/httpd"; %c = ( dir => { cgi => "$base/perl", docs => "$base/docs", img => "$base/docs/images", }, ); You have just learned how to make configuration files easily maintainable, and how to save memory by avoiding the export of variables into a script's namespace. =head3 Reloading Configuration Files First, let's look at a simple case, where we just have to look after a simple configuration file like the one below. Imagine a script that tells you who holds the patch pumpkin for the current Perl release. Sidenote: I<Pumpkin> A humorous term for the token (notional or real) that gives its possessor (the "pumpking" or the "pumpkineer") exclusive access to something, e.g. applying patches to a master copy of some source (for which the token is called the "patch pumpkin"). use CGI (); use strict; my $fname = "Larry"; my $lname = "Wall"; my $q = CGI->new; print $q->header(-type=>'text/html'); print $q->p("$fname $lname holds the patch pumpkin " . "for this Perl release."); The script has a hardcoded value for the name. It's very simple: initialize the CGI object, print the proper HTTP header and tell the world who holds the current patch pumpkin. When the patch pumpkin changes we don't want to modify the script. Therefore, we put the C<$fname> and C<$lname> variables into a configuration file. $fname = "Gurusamy"; $lname = "Sarathy"; 1; Please note that there is no package declaration in the above file, so the code will be evaluated in the caller's package or in the C<main::> package if none was declared. This means that the variables C<$fname> and C<$lname> will override (or initialize, if they weren't yet defined) the variables with the same names in the caller's namespace. This works for global variables only--you cannot update variables defined lexically (with my()) using this technique. You have started the server and everything is working properly. After a while you decide to modify the configuration. How do you let your running server know that the configuration was modified without restarting it? Remember we are in production, and server restarting can be quite expensive for us. One of the simplest solutions is to poll the file's modification time by calling stat() before the script starts to do real work. If we see that the file was updated, we force a reconfiguration of the variables located in this file. We will call the function that reloads the configuration reread_conf() and have it accept a single argument, which is the relative path to the configuration file. C<Apache::Registry> executes a chdir() to the script's directory before it starts the script's execution. So if your CGI script is invoked under the C<Apache::Registry> handler you can put the configuration file in the same directory as the script. Alternatively you can put the file in a directory below that and use a path relative to the script directory. You have to make sure that the file will be found, somehow. Be aware that do() searches the libraries in the directories in C<@INC>.
  use vars qw(%MODIFIED);
  sub reread_conf{
      my $file = shift;
      return unless defined $file;
      return unless -e $file and -r _;
      my $mod = -M _;
      unless (exists $MODIFIED{$file} and $MODIFIED{$file} == $mod) {
          my $result;
          unless ($result = do $file) {
              warn "couldn't parse $file: $@" if $@;
              warn "couldn't do $file: $!"    unless defined $result;
              warn "couldn't run $file"       unless $result;
          }
          $MODIFIED{$file} = $mod; # Update the MODIFICATION times
      }
  } # end of reread_conf

Notice that we use the C<==> comparison operator when checking the file's modification timestamp, because all we want to know is whether the file was changed or not.

When the require(), use() and do() operators successfully return, the file that was passed as an argument is inserted into C<%INC> (the key is the name of the file and the value is the path to it). Specifically, when Perl sees require() or use() in the code, it first tests C<%INC> to see whether the file is already there and thus loaded. If the test returns true, Perl saves the overhead of code re-reading and re-compiling; however, calling do() will (re)load regardless.

You generally don't notice this with plain perl scripts, but in mod_perl it's used all the time; after the first request served by a process, all the files loaded by require() stay in memory. If the file is preloaded at server startup, even the first request doesn't have the loading overhead.

We use do() to reload the code in this file and not require(), because while do() behaves almost identically to require(), it reloads the file unconditionally. If do() cannot read the file, it returns C<undef> and sets C<$!> to report the error. If do() can read the file but cannot compile it, it returns C<undef> and sets an error message in C<$@>. If the file is successfully compiled, do() returns the value of the last expression evaluated.

The configuration file can be broken if someone has incorrectly modified it. We don't want the whole service that uses that file to be broken just because of that. We trap the possible failure to do() the file and ignore the changes by resetting the modification time. If do() fails to load the file, it might be a good idea to send an email about the problem to the system administrator.

Notice, however, that since do() updates C<%INC> like require() does, if you are using C<Apache::StatINC> it will attempt to reload this file before the reread_conf() call. So if the file doesn't compile, the request will be aborted. C<Apache::StatINC> shouldn't be used in production (because it slows things down by stat()'ing all the files listed in C<%INC>), so this shouldn't be a problem.

Note that we assume that the entire purpose of this function is to reload the configuration if it was changed. This is fail-safe, because if something goes wrong we just return without modifying the server configuration. The script should not be used to initialize the variables on its first invocation. To do that, you would need to replace each occurrence of return() and warn() with die(). If you do that, take a look at the section "L<Redirecting Errors to the Client instead of error_log|snippets/Redirecting_Errors_to_the_Client>".

I used the above approach when I had a huge configuration file that was loaded only at server startup, and another little configuration file that included only a few variables that could be updated by hand or through the web interface. Those variables were initialized in the main configuration file.
If the webmaster breaks the syntax of this dynamic file while updating it by hand, it won't affect the main (write-protected) configuration file, and so won't stop the proper execution of the programs. Soon we will see a simple web interface which allows us to modify the configuration file without actually breaking it.

A sample script using the presented subroutine would be:

  use vars qw(%MODIFIED $fname $lname);
  use CGI ();
  use strict;

  my $q = CGI->new;
  print $q->header(-type=>'text/plain');
  my $config_file = "./config.pl";
  reread_conf($config_file);
  print $q->p("$fname $lname holds the patch pumpkin "
            . "for this Perl release.");

  sub reread_conf{
      my $file = shift;
      return unless defined $file;
      return unless -e $file and -r _;
      my $mod = -M _;
      unless ($MODIFIED{$file} and $MODIFIED{$file} == $mod) {
          my $result;
          unless ($result = do $file) {
              warn "couldn't parse $file: $@" if $@;
              warn "couldn't do $file: $!"    unless defined $result;
              warn "couldn't run $file"       unless $result;
          }
          $MODIFIED{$file} = $mod; # Update the MODIFICATION times
      }
  } # end of reread_conf

Remember that you should be using C<(stat $file)[9]> instead of C<-M $file> if you are modifying the C<$^T> variable. In some of my scripts, I reset C<$^T> to the time of the script invocation with C<$^T = time()>. That way I can perform C<-M> and the similar (C<-A>, C<-C>) file status tests relative to the script invocation time, and not the time the process was started.

If your configuration file is more sophisticated and it declares a package and exports variables, the above code will work just as well. Even if you think that you will have to import() the variables again, you won't: when do() recompiles the script, the originally imported variables get updated with the values from the reloaded code.

=head3 Dynamically updating configuration files

The CGI script below allows a system administrator to dynamically update a configuration file through a web interface. Combining this with the code we have just seen to reload the modified files, you get a system which is dynamically reconfigurable without needing to restart the server. Configuration can be performed from any machine having just a simple browser connected to the Internet.

Let's say you have a configuration file like this:

  package MainConfig;

  use strict;
  use vars qw(%c);

  %c = (
      name     => "Larry Wall",
      release  => "5.000",
      comments => "Adding more ways to do the same thing :)",

      other    => "More config values",

      hash     => {
          foo => "ouch",
          bar => "geez",
      },

      array    => [qw( a b c)],
  );

You want to make the variables C<name>, C<release> and C<comments> dynamically configurable. You want to have a web interface with an input form that allows you to modify these variables. Once modified, you want to update the configuration file and propagate the changes to all the currently running processes.

Quite a simple task. Let's look at the main stages of the implementation:

=over 4

=item 1

Create a form with the current values of the variables preset.

=item 2

Let the administrator modify the values and submit the changes.

=item 3

Validate the submitted information (numeric fields should carry numbers, literals--words, etc).

=item 4

Update the configuration file.

=item 5

Update the modified values in the memory of the current process.

=item 6

Present the form as before, but with the updated fields if any.

=back

The only part that seems to be complicated to implement is the configuration file update, for a couple of reasons. If updating the file breaks it, the whole service won't work.
If the file is very big and includes comments and complex data structures, parsing the file can be quite a challenge.

So let's simplify the task. If all we want is to update a few variables, why don't we create a tiny configuration file with just those variables? It can be modified through the web interface and overwritten each time there is something to be changed. This way we don't have to parse the file before updating it. If the main configuration file is changed we don't care, since we don't depend on it any more.

The dynamically updated variables are duplicated: they will be both in the main file and in the dynamic file. We do this to simplify maintenance. When a new release is installed, the dynamic configuration file won't exist at all. It will be created only after the first update. As we just saw, the only change in the main code is to add a snippet to load this file if it exists and was changed. This additional code must be executed after the main configuration file has been loaded. That way the updated variables will override the default values in the main file.

META: extend on the comments:

  # remember to run this code in taint mode
  use strict;
  use vars qw($q %c $dynamic_config_file %vars_to_change %validation_rules);

  use CGI ();

  use lib qw(.);
  use MainConfig ();
  *c = \%MainConfig::c;

  $dynamic_config_file = "./config.pl";

  # load the dynamic configuration file if it exists, and override the
  # default values from the main configuration file
  do $dynamic_config_file if -e $dynamic_config_file and -r _;

  # fields that can be changed and their titles
  %vars_to_change = (
      'name'     => "Patch Pumpkin's Name",
      'release'  => "Current Perl Release",
      'comments' => "Release Comments",
  );

  %validation_rules = (
      'name'     => sub { $_[0] =~ /^[\w\s\.]+$/;   },
      'release'  => sub { $_[0] =~ /^\d+\.[\d_]+$/; },
      'comments' => sub { 1;                        },
  );

  $q = CGI->new;
  print $q->header(-type=>'text/html'),
        $q->start_html();

  my %updates = ();

  # We always rewrite the dynamic config file, so we want all the
  # vars to be passed, but to save time we will only do checking
  # of vars that were changed. The rest will be retrieved from
  # the 'prev_foo' values.
  foreach (keys %vars_to_change) {
      # copy the var so we can modify it
      my $new_val = $q->param($_) || '';

      # strip a possible ^M char (DOS/WIN)
      $new_val =~ s/\cM//g;

      # push to the hash if it was changed
      $updates{$_} = $new_val
          if defined $q->param("prev_".$_)
             and $new_val ne $q->param("prev_".$_);
  }

  # Note that we cannot trust the previous values of the variables,
  # since they were presented to the user as hidden form variables
  # and the user could have mangled them. We don't care: it cannot do
  # any damage, as we verify each variable by the rules we define.

  # Process the changes if there is something to process. This won't be
  # called if the script is invoked the first time, to display the
  # form, or when the form was submitted but the values weren't
  # modified (we know that by comparing with the previous values of
  # the variables, which are the hidden fields in the form).

  # process and update the values if they are valid
  process_change_config(%updates) if %updates;

  # print the update form
  conf_modification_form();

  # update the config file, but first validate that the values are
  # correct
  #########################
  sub process_change_config{
      my %updates = @_;

      # we will list here all the malformatted vars
      my %malformatted = ();

      print $q->b("Trying to validate these values<BR>");
      foreach (keys %updates) {
          print "<DT><B>$_</B> => <PRE>$updates{$_}</PRE>";

          # now we have to handle each var to be changed very carefully,
          # since this file goes immediately into production!
          $malformatted{$_} = delete $updates{$_}
              unless $validation_rules{$_}->($updates{$_});

      } # end of foreach (keys %updates)

      # print warnings if there are any invalid changes
      print $q->hr,
            $q->p($q->b(qq{Warning! These variables were changed
                           but found malformed; thus the original
                           values will be preserved.})
                 ),
            join(",<BR>",
                 map { $q->b($vars_to_change{$_}) . " : $malformatted{$_}\n"
                     } keys %malformatted)
          if %malformatted;

      # Now complete the vars that weren't changed from the
      # $q->param('prev_var') values
      map { $updates{$_} = $q->param('prev_'.$_)
              unless exists $updates{$_}
          } keys %vars_to_change;

      # Now we have all the data that should be written into the
      # dynamic config file

      # escape single quotes "'" while creating the file
      my $content = join "\n",
          map { $updates{$_} =~ s/(['\\])/\\$1/g;
                '$c{' . $_ . "} = '" . $updates{$_} . "';\n"
              } keys %updates;

      # now add '1;' to make require() happy
      $content .= "\n1;";

      # compile-check the generated code by wrapping it in an
      # anonymous sub, without executing it
      eval "sub { $content }";
      if ($@) {
          print qq{Warning! Something went wrong with config file
                   generation!<P> The error was : <BR><PRE>$@</PRE>};
          return;
      }

      print $q->hr;

      # overwrite the dynamic config file
      use Symbol ();
      my $fh = Symbol::gensym();
      open $fh, ">$dynamic_config_file.bak"
          or die "Can't open $dynamic_config_file.bak for writing: $!\n";
      flock $fh, 2;    # exclusive lock
      seek $fh, 0, 0;  # rewind to the start
      truncate $fh, 0; # the file might shrink!
      print $fh $content;
      close $fh;

      # OK, now we turn it into the real file
      rename "$dynamic_config_file.bak", $dynamic_config_file
          or die "Failed to rename: $!";

      # rerun it to update the variables in the current process! Note
      # that it won't update the variables in other processes. Special
      # code that watches the timestamps on the config file will do this
      # work for each process. Since the next invocation will update the
      # configuration anyway, why do we need to load it here? The reason
      # is simple: we are going to fill the form's input fields with
      # the updated data.
      do $dynamic_config_file;

  } # end sub process_change_config

  ##########################
  sub conf_modification_form{

      print $q->center($q->h3("Update Form"));

      print $q->hr,
            $q->p(qq{This form allows you to dynamically update the
                     current configuration.
                     You don't need to restart the server in order
                     for the changes to take effect}
                 );

      # set the previous settings in the form's hidden fields, so we
      # know whether we have to do some changes or not
      map { $q->param("prev_$_", $c{$_}) } keys %vars_to_change;

      # rows for the table, which go into the form
      my @configs = ();

      # prepare the single-line textfield entries
      push @configs,
          map {
              $q->td( $q->b("$vars_to_change{$_}:") ),
              $q->td(
                  $q->textfield(-name      => $_,
                                -default   => $c{$_},
                                -override  => 1,
                                -size      => 20,
                                -maxlength => 50,
                               )
                    ),
              } qw(name release);

      # prepare the multiline textarea entries
      push @configs,
          map {
              $q->td( $q->b("$vars_to_change{$_}:") ),
              $q->td(
                  $q->textarea(-name     => $_,
                               -default  => $c{$_},
                               -override => 1,
                               -rows     => 10,
                               -columns  => 50,
                               -wrap     => "HARD",
                              )
                    ),
              } qw(comments);

      print $q->startform('POST', $q->url), "\n",
            $q->center($q->table(map {$q->Tr($_), "\n",} @configs),
                       $q->submit('', 'Update!'), "\n",
                      ),
            map ({$q->hidden("prev_".$_, $q->param("prev_".$_))."\n"
                 } keys %vars_to_change), # hidden previous values
            $q->br, "\n",
            $q->endform, "\n",
            $q->hr, "\n",
            $q->end_html;

  } # end sub conf_modification_form

Once updated, the script generates a file like:

  $c{release} = '5.6';

  $c{name} = 'Gurusamy Sarathy';

  $c{comments} = 'Perl rules the world!';

  1;

=head2 Reloading handlers

If you want to reload a PerlHandler on each invocation, the following trick will do it:

  PerlHandler "sub { do 'MyTest.pm'; MyTest::handler(shift) }"

do() reloads C<MyTest.pm> on every request.

=head1 Name collisions with Modules and libs

This section requires an in-depth understanding of L<use(), require(), do(), %INC and @INC|perl/use_require_do_INC_and>.

To make things clear before we go into details: each child process has its own C<%INC> hash which is used to store information about its compiled modules. The keys of the hash are the names of the modules and files passed as arguments to require() and use(). The values are the full or relative paths to these modules and files.

Suppose we have C<my-lib.pl> and C<MyModule.pm>, both located in C</home/httpd/perl/my/>.

=over 4

=item * C</home/httpd/perl/my/> is in C<@INC> at server startup.

  require "my-lib.pl";
  use MyModule;
  print $INC{"my-lib.pl"},  "\n";
  print $INC{"MyModule.pm"},"\n";

prints:

  /home/httpd/perl/my/my-lib.pl
  /home/httpd/perl/my/MyModule.pm

Adding C<use lib>:

  use lib qw(.);
  require "my-lib.pl";
  use MyModule;
  print $INC{"my-lib.pl"},  "\n";
  print $INC{"MyModule.pm"},"\n";

prints:

  my-lib.pl
  MyModule.pm

=item * C</home/httpd/perl/my/> isn't in C<@INC> at server startup.

  require "my-lib.pl";
  use MyModule;
  print $INC{"my-lib.pl"},  "\n";
  print $INC{"MyModule.pm"},"\n";

wouldn't work, since perl cannot find the modules.

Adding C<use lib>:

  use lib qw(.);
  require "my-lib.pl";
  use MyModule;
  print $INC{"my-lib.pl"},  "\n";
  print $INC{"MyModule.pm"},"\n";

prints:

  my-lib.pl
  MyModule.pm

=back

Let's look at three scripts with faults related to name space. For the following discussion we will consider just one individual child process.

=over 4

=item Scenario 1

First, you can't have two identical module names running on the same server! Only the first one found in a use() or require() statement will be compiled into the package; the request to load the second module will be skipped, since the server will think that it's already compiled. This is a direct result of using C<%INC>, which has keys equal to the names of the modules. Two identical names will refer to the same key in the hash.
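When you suspect such a collision, a quick way to see which file the current process actually compiled is to look the module up in C<%INC>. A minimal sketch (C<Foo> stands for whichever module you are checking):

  use Foo;
  # the key is the relative name passed to use()/require();
  # the value shows which Foo.pm this process really compiled
  print STDERR "Foo was loaded from $INC{'Foo.pm'}\n";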
(Refer to the section 'L<Looking inside the server|debug/Looking_inside_the_server>' to find out how you can know what is loaded and where.)

So suppose you have two different C<Foo> modules in two different directories and two scripts, C<script1.pl> and C<script2.pl>, placed like this:

  ./tool1/Foo.pm
  ./tool1/tool1.pl
  ./tool2/Foo.pm
  ./tool2/tool2.pl

where some sample code could be:

  ./tool1/tool1.pl
  ----------------
  use Foo;
  print "Content-type: text/plain\r\n\r\n";
  print "I'm Script number One\n";
  foo();

  ./tool1/Foo.pm
  --------------
  sub foo{
    print "<B>I'm Tool Number One!</B>\n";
  }
  1;

  ./tool2/tool2.pl
  ----------------
  use Foo;
  print "Content-type: text/plain\r\n\r\n";
  print "I'm Script number Two\n";
  foo();

  ./tool2/Foo.pm
  --------------
  sub foo{
    print "<B>I'm Tool Number Two!</B>\n";
  }
  1;

Both scripts call C<use Foo;>. Only the first one called will know about C<Foo>. When you call the second script it will not know about C<Foo> at all--it's as if you've forgotten to write C<use Foo;>. Run the server in L<single server mode|control/Running_a_Server_in_Single_Proce> to detect this kind of bug immediately.

You will see the following in the error_log file:

  Undefined subroutine
  &Apache::ROOT::perl::tool2::tool2_2epl::foo called at
  /home/httpd/perl/tool2/tool2.pl line 4.

=item Scenario 2

If the files do not declare a package, the above is true for libraries (e.g. I<my-lib.pl>) you require() as well.

Suppose that you have a directory structure like this:

  ./tool1/config.pl
  ./tool1/tool1.pl
  ./tool2/config.pl
  ./tool2/tool2.pl

and both scripts contain:

  use lib qw(.);
  require "config.pl";

while I<./tool1/config.pl> can be something like this:

  $foo = 0;
  1;

and I<./tool2/config.pl>:

  $foo = 1;
  1;

This scenario is no different from the first: there is almost no difference between use() and require() if you don't have to import some symbols into the calling script. Only the first script served will actually do the require(), for the same reason as in the example above: C<%INC> already includes the key I<"config.pl">!

=item Scenario 3

It is interesting that the following scenario will fail too!

  ./tool/config.pl
  ./tool/tool1.pl
  ./tool/tool2.pl

where C<tool1.pl> and C<tool2.pl> both require() the B<same> C<config.pl>.

=back

There are three solutions for this:

=over

=item Solution 1

The first two faulty scenarios can be solved by placing your library modules in a subdirectory structure, so that they have different path prefixes. The file system layout will be something like:

  ./tool1/Tool1/Foo.pm
  ./tool1/tool1.pl
  ./tool2/Tool2/Foo.pm
  ./tool2/tool2.pl

And modify the scripts:

  use Tool1::Foo;
  use Tool2::Foo;

For require() (scenario number 2) use the following:

  ./tool1/tool1-lib/config.pl
  ./tool1/tool1.pl
  ./tool2/tool2-lib/config.pl
  ./tool2/tool2.pl

And each script contains respectively:

  use lib qw(.);
  require "tool1-lib/config.pl";

  use lib qw(.);
  require "tool2-lib/config.pl";

This solution isn't good: while it might work for you now, if you add another script that wants to use the same module or C<config.pl> file, it would fail as we saw in the third scenario. Let's see some better solutions.

=item Solution 2

Another option is to use a full path to the file, so that it will be used as the key in C<%INC>:

  require "/full/path/to/the/config.pl";

This solution solves the problem of the first two scenarios. I was surprised that it worked for the third scenario as well!

With this solution you lose some portability.
If you move the tool around in the file system, you will have to change the base directory or write some additional script that will automatically update the hardcoded path after it has been moved. And of course you will have to remember to invoke it.

=item Solution 3

Make sure you read all of this solution.

Declare a package name in the required files! It should be unique in relation to the rest of the package names you use. C<%INC> will then use the unique package name for the key.

It's a good idea to use at least two-level package names for your private modules, e.g. C<MyProject::Carp> and not C<Carp>, since the latter will collide with an existing standard package. Even though a package may not exist in the standard distribution now, a package with the name you've chosen may come along in a later distribution and collide with it. Using a two-part package name will help avoid this problem. An even better approach is to use three-level naming, like C<CompanyName::ProjectName::Module>, which is most unlikely to conflict with later Perl releases. Foresee problems like this and save yourself future trouble.

What are the implications of package declarations?

Without package declarations, it is very convenient to use() or require() files, because all the variables and subroutines are part of the C<main::> package. Any of them can be used as if they were part of the main script. With package declarations things are more awkward. You have to use the C<Package::function()> syntax to call a subroutine from C<Package>, and to access a global variable C<$foo> inside the same package you have to write C<$Package::foo>. Lexically defined variables--those declared with my() inside C<Package>--will be inaccessible from outside the package.

You can leave your scripts unchanged if you import the names of the global variables and subroutines into the namespace of package B<main::> like this:

  use Module qw(:mysubs sub_b $var1 :myvars);

You can export both subroutines and global variables. Note however that this method has the disadvantage of consuming more memory for the current process.

See C<perldoc Exporter> for information about exporting other variables and symbols.

This completely covers the third scenario. When you use different module names in package declarations, as explained above, you cover the first two as well.

=back

See also the C<perlmodlib> and C<perlmod> manpages.

From the above discussion it should be clear that you cannot run development and production versions of the tools using the same Apache server! You have to run a separate server for each. They can be on the same machine, but the servers will use different ports.

=head1 More package name related issues

If you have the following:

  PerlHandler Apache::Work::Foo
  PerlHandler Apache::Work::Foo::Bar

and you make a request that pulls in C<Apache/Work/Foo/Bar.pm> first, then the C<Apache::Work::Foo> package gets defined, so mod_perl does not try to pull in C<Apache/Work/Foo.pm>.

=head1 __END__ and __DATA__ tokens

C<Apache::Registry> scripts cannot contain C<__END__> or C<__DATA__> tokens.

Why?
Because C<Apache::Registry> scripts are wrapped into a subroutine called C<handler>. Take the script at URI C</perl/test.pl>:

  print "Content-type: text/plain\r\n\r\n";
  print "Hi";

When the script is executed under the C<Apache::Registry> handler, it actually becomes:

  package Apache::ROOT::perl::test_2epl;
  use Apache qw(exit);
  sub handler {
    print "Content-type: text/plain\r\n\r\n";
    print "Hi";
  }

So if you happen to put in an C<__END__> tag, like this:

  print "Content-type: text/plain\r\n\r\n";
  print "Hi";

  __END__
  Some text that wouldn't normally be executed

it will be turned into:

  package Apache::ROOT::perl::test_2epl;
  use Apache qw(exit);
  sub handler {
    print "Content-type: text/plain\r\n\r\n";
    print "Hi";

    __END__
    Some text that wouldn't normally be executed
  }

and when you try to execute this script, you will receive the following error:

  Missing right bracket at .... line 4, at end of line

Perl cuts off everything after the C<__END__> tag. The same applies to the C<__DATA__> tag.

Also, remember that whatever applies to C<Apache::Registry> scripts in most cases applies to C<Apache::PerlRun> scripts as well.

=head1 Output from system calls

The output of C<system()>, C<exec()>, and C<open(PIPE,"|program")> calls will not be sent to the browser unless your Perl was configured with C<sfio>.

You can use backticks as a possible workaround:

  print `command here`;

But you're throwing performance out the window either way. It's best not to fork at all if you can avoid it. See the "L<Forking or Executing subprocesses from mod_perl|performance/Forking_and_Executing_Subprocess>" section to learn about the implications of forking.

Also read about L<Apache::SubProcess|porting/CGI_to_mod_perl_Porting__mod_perl_Coding_guidelines__> for overridden system() and exec() implementations that work with mod_perl.

=head1 Using format() and write()

The interface to filehandles which are linked to variables with Perl's tie() function is not yet complete. The format() and write() functions are missing. If you configure Perl with C<sfio>, write() and format() should work just fine.

Otherwise you can use sprintf() to replace format(): C<##.##> becomes C<%2.2f> and C<####.##> becomes C<%4.2f>. Pad all strings with S<(" " x 80)> before using them, and set their length with S<%.25s> for a maximum 25-character string, or prefix the string with S<(" " x 80)> for right-justifying.

Another alternative is to use the C<Text::Reform> module.

=head1 Terminating requests and processes, the exit() and child_terminate() functions

Perl's C<exit()> built-in function (all versions prior to 5.6) cannot be used in mod_perl scripts. Calling it causes the mod_perl process to exit (which defeats the purpose of using mod_perl). The C<Apache::exit()> function should be used instead. Starting from Perl version 5.6, mod_perl will override exit() behind the scenes, using C<CORE::GLOBAL::>, a new I<magical> package.

You might start your scripts by overriding the exit() subroutine (if you use C<Apache::exit()> directly, you will have a problem testing the script from the shell, unless you put C<use Apache ();> into your code.) I use the following code:

  use constant IS_MODPERL => $ENV{MOD_PERL};
  use subs qw(exit);

  # Select the correct exit function
  *exit = IS_MODPERL ? \&Apache::exit : sub { CORE::exit };

Now the correct C<exit()> is always chosen, whether the script is running under mod_perl, ordinary CGI or from the shell.
Notice that since we are using the C<constant> pragma, there is no runtime overhead in selecting one of the code references: the C<IS_MODPERL> constant is folded at compile time, so the branch that doesn't apply is optimized away.

Note that if you run the script under C<Apache::Registry>, B<the Apache function C<exit()> overrides the Perl core built-in function>. While you see exit() listed in the C<@EXPORT_OK> list of the Apache package, C<Apache::Registry> does something you don't see and imports this function for you. This means that if your script is running under the C<Apache::Registry> handler, you don't have to worry about exit(). The same applies to C<Apache::PerlRun>.

If you use C<CORE::exit()> in scripts running under mod_perl, the child will exit, but neither a proper exit nor logging will happen on the way. C<CORE::exit()> cuts off the server's legs.

Note that C<Apache::exit(Apache::Constants::DONE)> will cause the server to exit gracefully, completing the logging functions and protocol requirements etc. (C<Apache::Constants::DONE> == -2, C<Apache::Constants::OK> == 0.)

If you need to shut down the child cleanly after the request was completed, use the C<$r-E<gt>child_terminate> method. You can call it anywhere in the code, not just at the "end". This sets the value of the C<MaxRequestsPerChild> configuration variable to 1 and clears the C<keepalive> flag. After the request is serviced, the current connection is broken because of the C<keepalive> flag, and the parent tells the child to cleanly quit, because C<MaxRequestsPerChild> is smaller than the number of requests served.

In an C<Apache::Registry> script you would do:

  Apache->request->child_terminate;

or in httpd.conf:

  PerlFixupHandler "sub { shift->child_terminate }"

You would want to use the latter example only if you wanted the child to terminate every time the registered handler is called. Probably this is not what you want.

You don't have to call child_terminate() at the very end of the request code: you can register a post-processing handler instead. You might do this if you wanted to execute your own code a moment before the process quits:

  my $r = shift;
  $r->post_connection(\&exit_child);

  sub exit_child{
     # some logic here if needed
     $r->child_terminate;
  }

The above is the code that is used by the C<Apache::SizeLimit> module, which terminates processes that grow bigger than a value you choose. L<Apache::GTopLimit|modules/Apache__GTopLimit___Limit_Apache_httpd_processes> (based on I<libgtop> and C<GTop.pm>) is a similar module. It does the same thing, plus you can configure it to terminate processes when their shared memory shrinks below some specified size.

=head1 die() and mod_perl

When you write:

  open FILE, "foo" or die "Cannot open foo file for reading: $!";

in a perl script and execute it, the script would die() if it is unable to open the file: the script execution is aborted, the reason for the death is printed and the Perl interpreter quits. You will hardly find a properly written Perl script that doesn't have at least one die() statement in it, if it has to cope with system calls and the like.

A CGI script running under mod_cgi exits on its completion, and the Perl interpreter exits as well. So it doesn't really matter whether the interpreter quits because the script died by natural death (when the last statement was executed) or was aborted by a die() statement.

In mod_perl we don't want the interpreter to quit.
We already know that when the script completes its chores, the interpreter won't quit. There is no reason why it should quit when the script has stopped because of die(). As a result, calling die() won't quit the process.

And this is how it works: when die() gets triggered, it's mod_perl's C<$SIG{__DIE__}> handler that logs the error message and calls Apache::exit() instead of CORE::die(). Thus the script stops, but the process doesn't quit. Here is an example of such trapping code, although it isn't the real code:

  $SIG{__DIE__} = sub { print STDERR @_; Apache::exit(); };

=head1 Return Codes

C<Apache::Registry> normally assumes a return code of C<OK> (C<200>). If you want to send another return code, use C<$r-E<gt>status()>:

  use Apache::Constants qw(NOT_FOUND);
  $r->status(NOT_FOUND);

Of course if you do that, you don't have to call C<$r-E<gt>send_http_header()> (assuming that you have C<PerlSendHeader Off>).

=head1 Testing the Code from the Shell

Your CGI scripts will B<not> yet run from the command line unless you use C<CGI::Switch> or C<CGI.pm> and have Perl 5.004 or later. They must not make any direct calls to Apache's Perl API methods.

=head1 I/O is different

If you are using Perl 5.004 or later, most CGI scripts can run under mod_perl untouched. If you're using 5.003, Perl's built-in C<read()> and C<print()> functions do not work as they do under CGI. If you're using C<CGI.pm>, use C<$query-E<gt>print> instead of plain ol' C<print()>.

=head1 STDIN, STDOUT and STDERR streams

In mod_perl both C<STDIN> and C<STDOUT> are tied to the socket the request came from. Because the C-level STDOUT is not hooked up to the client, you can re-open the STDOUT filehandle using tie(). For example, if you want to dup a STDOUT filehandle and have the code work both with and without mod_perl, the following example will do:

  use constant IS_MODPERL => $ENV{MOD_PERL};
  if (IS_MODPERL) {
    tie *OUT, 'Apache';
  } else {
    open (OUT, ">-");
  }

Note that C<OUT> was picked just as an example--there is nothing special about it. If you are looking to redirect the STDOUT stream into a scalar, see the L<Redirecting STDOUT into a String|porting/Redirecting_STDOUT_into_a_Scalar> section.

C<STDERR> is tied to the file defined by the C<ErrorLog> directive.

=head1 Redirecting STDOUT into a Scalar

Sometimes you have a situation where a black box function prints its output to C<STDOUT> and you want to get this output into a scalar. This is just as valid under mod_perl, where you want C<STDOUT> to stay tied to the C<Apache> object. The C<IO::String> package comes to help here. You can redirect C<STDOUT> (or any other filehandle) into a string by doing a simple select() on the C<IO::String> object, and at the end select() the original filehandle back:

  my $str;
  my $str_fh = IO::String->new($str);
  my $old_fh = select($str_fh);

  # some function that prints to the currently selected filehandle
  print_stuff();

  # reset the default fh to its previous value
  select($old_fh) if defined $old_fh;

=head1 Apache::print() and CORE::print()

Under mod_perl C<CORE::print()> will redirect its data to C<Apache::print()>, since the STDOUT filehandle is tied to the I<Apache> module. This allows us to run CGI scripts unmodified under C<Apache::Registry> by chaining the output of one content handler to the input of the other handler.

C<Apache::print()> behaves mostly like the built-in I<print()> function.
In addition it sets a timeout, so that if the client connection is broken the handler won't wait forever trying to print data downstream to the client.

There is also an optimization built into C<Apache::print()>: if any of the arguments to the method are scalar references to strings, they are automatically dereferenced for you. This avoids needless copying of large strings when passing them to subroutines. For example:

  $long_string = "A" x 10000000;
  $r->print(\$long_string);

If you still want to print the reference you can always call:

  $r->print(\\$foo);

or force it into a scalar context:

  print(scalar($foo));

=head1 Global Variables Persistence

Since the child process generally doesn't exit before it has serviced several requests, global variables persist inside the same process from request to request. This means that you must never rely on the value of a global variable if it wasn't initialized at the beginning of the request processing. See "L<Variables globally, lexically scoped and fully qualified|perl/Variables_Globally_Lexically_Sc>" for more info.

You should avoid using global variables unless it's impossible to work without them, because they make code development harder and you have to make certain that all the variables are initialized before they are used. Use my() scoped variables wherever you can.

You should be especially careful with L<Perl Special Variables|perl/The_Scope_of_the_Special_Perl_Va>, which cannot be lexically scoped. You have to use local() instead.

=head1 Generating correct HTTP Headers

An HTTP response header consists of at least two fields: the HTTP response status line and the C<Content-type> MIME type header:

  HTTP/1.0 200 OK
  Content-Type: text/plain

After adding one more newline, you can start printing the content. A more complete response includes the date timestamp and the server type, for example:

  HTTP/1.0 200 OK
  Date: Tue, 28 Dec 1999 18:47:58 GMT
  Server: Apache/1.3.10-dev (Unix) mod_perl/1.21_01-dev
  Content-Type: text/plain

If the server is configured with C<KeepAlive Off>, you also need to tell the client that the connection will be closed, with:

  Connection: close

There can be other headers as well, like caching control headers and others specified by the HTTP protocol. You can code the response header with a single print():

  print qq{HTTP/1.1 200 OK
  Date: Tue, 28 Dec 1999 18:49:41 GMT
  Server: Apache/1.3.10-dev (Unix) mod_perl/1.21_01-dev
  Connection: close
  Content-type: text/plain

  };

or with a I<"here"> style print:

  print <<EOT;
  HTTP/1.1 200 OK
  Date: Tue, 28 Dec 1999 18:49:41 GMT
  Server: Apache/1.3.10-dev (Unix) mod_perl/1.21_01-dev
  Connection: close
  Content-type: text/plain

  EOT

Notice the double newline at the end. But doing it this way, you have to prepare a timestamp string (C<Apache::Util::ht_time()> does just this) and to know what server you are running under. You needed to send only the response MIME type (C<Content-type>) under mod_cgi, so why would you want to do all this manually under mod_perl?

Actually, sometimes you do want to set some headers manually, but not every time. So mod_perl gives you the default set of headers, just like in the example above, and if you want to override or add more headers you can do that as well. Let's see how.

When writing your own handlers and scripts with the Perl Apache API, the proper way to send the HTTP header is with the send_http_header() method.
If you need to add or override headers, you can use the header_out() method:

  $r->header_out("Server" => "Apache Next Generation 10.0");
  $r->header_out("Date"   => "Tue, 28 Dec 1999 18:49:41 GMT");

When you have prepared all the headers, you send them with:

  $r->send_http_header;

Some headers have special aliases:

  $r->content_type('text/plain');

is the same as:

  $r->header_out("Content-type" => "text/plain");

A typical handler looks like this:

  $r->content_type('text/plain');
  $r->send_http_header;
  return OK if $r->header_only;

If the client issues an HTTP C<HEAD> request rather than the usual C<GET>, to be compliant with the HTTP protocol we should not send the document body, but only the HTTP header. When Apache receives a C<HEAD> request, I<header_only()> returns I<true>. If we see that this has happened, we return from the handler immediately with an C<OK> status code.

Generally, you don't need to set the content type explicitly, since Apache does this for you by looking up the MIME type of the request and by matching the extension of the URI in the MIME tables (from the I<mime.types> file). So if the request URI is I</welcome.html>, the C<text/html> content type will be picked. However, for CGI scripts or URIs that cannot be mapped by a known extension, you should set the appropriate type by using the content_type() method.

The situation is a little bit different with C<Apache::Registry> and similar handlers. Take a basic CGI script like this:

  print "Content-type: text/plain\r\n\r\n";
  print "Hello world";

It wouldn't work, because the HTTP header would not be sent out. By default, mod_perl does not send any headers itself. You may wish to change this by adding:

  PerlSendHeader On

in the C<Apache::Registry> C<E<lt>LocationE<gt>> section of your configuration. Now the response line and common headers will be sent in the same way mod_cgi sends them. Just as with mod_cgi, C<PerlSendHeader> will not send the MIME type and a terminating double newline. Your script must send that itself, e.g.:

  print "Content-type: text/html\r\n\r\n";

According to the HTTP specs, you should send the "\cM\cJ", "\015\012" or "\x0D\x0A" string. The "\r\n" is the way to do that on UNIX and MS-DOS/Windows machines. However, on a Mac "\r\n" eq "\012\015", exactly the other way around. Note that in most UNIX CGI scripts, developers use a simpler "\n\n" and not "\r\n\r\n", but since there are occasions where sending "\n" without "\r" can cause problems, make it a habit to always send "\r\n".

If you use an OS which uses EBCDIC as its character set (e.g. BS2000-Posix), you should use this method to send the C<Content-type> header:

  shift->send_http_header('text/html');

The C<PerlSendHeader On> directive tells mod_perl to intercept anything that looks like a header line (such as C<Content-Type: text/plain>) and automatically turn it into a correctly formatted HTTP/1.0 header, the same way it happens with CGI scripts running under mod_cgi. This allows you to keep your CGI scripts unmodified.

You can use C<$ENV{PERL_SEND_HEADER}> to find out whether C<PerlSendHeader> is B<On> or B<Off>, and use it in your module like this:

  if($ENV{PERL_SEND_HEADER}) {
      print "Content-type: text/html\r\n\r\n";
  }
  else {
      my $r = Apache->request;
      $r->content_type('text/html');
      $r->send_http_header;
  }

Note that you can always use the code in the else part of the above example, no matter whether the C<PerlSendHeader> directive is On or Off.
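If you also want the same code to run unmodified under plain mod_cgi, you can combine this check with the C<$ENV{MOD_PERL}> test we used earlier for exit(). Here is a minimal sketch (the helper name send_header() is our own invention):

  use constant IS_MODPERL => $ENV{MOD_PERL};

  sub send_header {
      my $type = shift || 'text/html';
      if (IS_MODPERL and not $ENV{PERL_SEND_HEADER}) {
          # mod_perl with PerlSendHeader Off: use the Apache API
          my $r = Apache->request;
          $r->content_type($type);
          $r->send_http_header;
      }
      else {
          # mod_cgi, or mod_perl with PerlSendHeader On: print the
          # header line and the terminating empty line ourselves
          print "Content-type: $type\r\n\r\n";
      }
  }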
If you use C<CGI.pm>'s C<header()> function to generate HTTP headers, you do not need to activate this directive, because C<CGI.pm> detects I<mod_perl> and calls C<send_http_header()> for you.

There is no free lunch--you get the mod_cgi behavior at the expense of the small but finite overhead of parsing the text that is sent. Note that mod_perl makes the assumption that individual headers are not split across print statements.

The C<Apache::print()> routine has to gather up the headers that your script outputs in order to pass them to C<$r-E<gt>send_http_header>. This happens in C<src/modules/perl/Apache.xs> (C<print>) and C<Apache/Apache.pm> (C<send_cgi_header>). There is a shortcut in there, namely the assumption that each print statement contains one or more complete headers. If, for example, you generate a C<Set-Cookie> header by multiple C<print()> statements, like this:

  print "Content-type: text/plain\n";
  print "Set-Cookie: iscookietext\; ";
  print "expires=Wednesday, 09-Nov-1999 00:00:00 GMT\; ";
  print "path=\/\; ";
  print "domain=\.myserver.com\; ";
  print "\r\n\r\n";
  print "hello";

your generated C<Set-Cookie> header is split over a number of print() statements and gets lost. The above example wouldn't work! Try this instead:

  my $cookie = "Set-Cookie: iscookietext\; ";
  $cookie .= "expires=Wednesday, 09-Nov-1999 00:00:00 GMT\; ";
  $cookie .= "path=\/\; ";
  $cookie .= "domain=\.myserver.com\; ";
  print "Content-type: text/plain\n";
  print "$cookie\r\n\r\n";
  print "hello";

Of course, using special-purpose cookie generator modules like C<Apache::Cookie> or C<CGI::Cookie> is an even cleaner solution.

Sometimes when you call a script you see an ugly S<"Content-Type: text/html"> displayed at the top of the page, and of course the rest of the HTML code won't be rendered correctly by the browser. As you have seen above, this generally happens when your code has already sent the header, so you see the duplicate header rendered in the browser's page. This might happen when you call the C<CGI.pm> C<$q-E<gt>header> method or mod_perl's C<$r-E<gt>send_http_header>.

If you have a complicated application where the header might be generated from many different places, depending on the calling logic, you might want to write a special subroutine that sends a header and keeps track of whether the header has already been sent. You can use a global variable to flag that the header has already been sent:

  use strict;
  use vars qw{$header_printed};
  $header_printed = 0;

  print_header("text/plain");
  print "It worked!\n";
  print_header("text/plain");

  sub print_header {
      my $type = shift || "text/html";
      unless ($header_printed) {
          $header_printed = 1;
          my $r = Apache->request;
          $r->content_type($type);
          $r->send_http_header;
      }
  }

C<$header_printed> is the variable that flags whether the header was sent or not, and it gets initialized to false (0) at the beginning of each code invocation. Note that the second invocation of print_header() within the same code will do nothing, since C<$header_printed> becomes true after print_header() is executed for the first time.
A solution that is a little bit more memory friendly is to use a fully qualified variable instead:

  use strict;
  $main::header_printed = 0;

  print_header("text/plain");
  print "It worked!\n";
  print_header("text/plain");

  sub print_header {
      my $type = shift || "text/html";
      unless ($main::header_printed) {
          $main::header_printed = 1;
          my $r = Apache->request;
          $r->content_type($type);
          $r->send_http_header;
      }
  }

We just removed the global variable predeclaration, which allowed us to use C<$header_printed> under C<"use strict">, and replaced C<$header_printed> with C<$main::header_printed>.

You may be tempted to use a more elegant Perl solution--the nested subroutine effect--which seems to be a natural approach to take here. Unfortunately it will not work. If the process was starting fresh for each script or handler, as with plain mod_cgi scripts, it would work just fine:

  use strict;

  print_header("text/plain");
  print "It worked!\n";
  print_header("text/plain");

  {
    my $header_printed = 0;
    sub print_header {
        my $type = shift || "text/html";
        unless ($header_printed) {
            $header_printed = 1;
            my $r = Apache->request;
            $r->content_type($type);
            $r->send_http_header;
        }
    }
  }

In this code C<$header_printed> is declared as lexically scoped (with my()) outside the subroutine print_header() and modified inside of it. The curly braces define the block which limits the scope of the lexically scoped variable. This means that once print_header() sets it to 1, it will stay 1 as long as the code is running. So all subsequent calls to this subroutine will just return without doing a thing. This would serve our purpose, but unfortunately it will only work the first time the script is invoked within a process. When the script is executed for the second or subsequent times and is served by the same process, the header will not be printed anymore, since print_header() will remember that the value of C<$header_printed> is equal to 1--it won't be reinitialized, since the subroutine won't be recompiled.

Why can't we use a lexical without hitting the nested subroutine effect? Because, as we saw when we discussed L<Apache::Registry secrets|porting/Exposing_Apache_Registry_secret>, the code is wrapped in a C<handler> routine, effectively turning any subroutines within the file a script resides in into nested subroutines. Hence we are forced to use a global in this situation.

Let's make our smart method more elaborate with respect to the C<PerlSendHeader> directive, so that it always does the right thing. It's especially important if you write an application that you are going to distribute, hopefully under one of the Open Source or GPL licenses. You can continue to improve this subroutine even further to handle additional headers, such as cookies.

See also L<Correct Headers--A quick guide for mod_perl users|correct_headers/>

=head1 NPH (Non Parsed Headers) scripts

To run a Non Parsed Header CGI script under mod_perl, simply add to your code:

  local $| = 1;

And if you normally set C<PerlSendHeader On>, add this to your server's configuration file:

  <Files */nph-*>
    PerlSendHeader Off
  </Files>

=head1 BEGIN blocks

Perl executes C<BEGIN> blocks as soon as possible, at the time of compiling the code. The same is true under mod_perl. However, since mod_perl normally only compiles scripts and modules once, either in the parent server or once per-child, C<BEGIN> blocks in that code will only be run once. As the C<perlmod> manpage explains, once a C<BEGIN> block has run, it is immediately undefined.
In the mod_perl environment, this means that C<BEGIN> blocks will not be run during the response to an incoming request unless that request happens to be the one that causes the compilation of the code.

C<BEGIN> blocks in modules and files pulled in via C<require()> or C<use()> will be executed:

=over 4

=item *

Only once, if pulled in by the parent process.

=item *

Once per-child process, if not pulled in by the parent process.

=item *

An additional time, once per child process, if the module is pulled in off disk again via C<Apache::StatINC>.

=item *

An additional time, in the parent process on each restart, if C<PerlFreshRestart> is C<On>.

=item *

Unpredictably, if you fiddle with C<%INC> yourself.

=back

C<BEGIN> blocks in C<Apache::Registry> scripts will be executed as above, plus:

=over 4

=item *

Only once, if pulled in by the parent process via C<Apache::RegistryLoader>.

=item *

Once per-child process, if not pulled in by the parent process.

=item *

An additional time, once per child process, each time the script file changes on disk.

=item *

An additional time, in the parent process on each restart, if pulled in by the parent process via C<Apache::RegistryLoader> and C<PerlFreshRestart> is C<On>.

=back

Make sure you read L<Evil things might happen when using PerlFreshRestart|troubleshooting/Evil_things_might_happen_when_us>.

=head1 END blocks

As the C<perlmod> manpage explains, an C<END> subroutine is executed as late as possible, that is, when the interpreter exits. In the mod_perl environment, the interpreter does not exit until the server shuts down. However, mod_perl does make a special case for C<Apache::Registry> scripts.

Normally, C<END> blocks are executed by Perl during its B<perl_run()> function. This is called once each time the Perl program is executed, i.e. under mod_cgi, once per invocation of the CGI script. However, mod_perl only calls B<perl_run()> once, during server startup. Any C<END> blocks encountered during main server startup, i.e. those pulled in by C<PerlRequire> or by any C<PerlModule>, are suspended.

Except during the cleanup phase, any C<END> blocks encountered during compilation of C<Apache::Registry> scripts, including subsequent invocations when the script is cached in memory, are called after the script has completed. All other C<END> blocks encountered during other C<Perl*Handler> call-backs, e.g. C<PerlChildInitHandler>, will be suspended while the process is running and called during C<child_exit()> when the process is shutting down.

Module authors might wish to use C<$r-E<gt>register_cleanup()> as an alternative to C<END> blocks if this behavior is not desirable. C<$r-E<gt>register_cleanup()> is called at the CleanUp processing phase of each request and thus can be used to emulate plain perl's C<END{}> block behavior.

The last paragraph is very important for handling the case of L<'User Pressed the Stop Button'|debug/Handling_the_User_pressed_Stop_>.

If you only want something to run once in the parent on shutdown or restart, you can use C<$r-E<gt>register_cleanup()> in the I<startup.pl>:

  #PerlRequire startup.pl
  warn "parent pid is $$\n";
  Apache->server->register_cleanup
      (sub { warn "server cleanup in $$\n" });

This is usually useful when some server-wide cleanup should be performed when the server is stopped or restarted.

=head1 CHECK Blocks -- available since perl5.6.0

This block runs when compilation is complete, but before the program starts. (B<CHECK> can mean "checkpoint" or "double-check" or even just "stop".)
Perl only calls B<CHECK> blocks during perl_parse(), which mod_perl calls once at startup time. Therefore B<CHECK> blocks don't work, for the same reason this doesn't:

  % perl -e 'eval qq(CHECK { print "ok\n" })'

=head1 Command Line Switches (-w, -T, etc)

Normally when you run perl from the command line, you have the shell invoke it with C<#!/bin/perl> (sometimes referred to as the shebang line). In scripts running under mod_cgi, you may use perl execution switch arguments as described in the C<perlrun> manpage, such as C<-w>, C<-T> or C<-d>. Since scripts running under mod_perl don't need the shebang line, all switches except C<-w> are ignored by mod_perl. This feature was added for backward compatibility with CGI scripts.

Most command line switches have a special variable equivalent which allows them to be set/unset in code. Consult the C<perlvar> manpage for more details.

=head2 Warnings

There are three ways to enable warnings:

=over

=item * Globally to all Processes

Setting:

  PerlWarn On

in C<httpd.conf> will turn warnings B<On> in any script. You can then fine-tune your code, turning warnings B<Off> and B<On> by setting the C<$^W> variable in your scripts.

=item * Locally to a script

  #!/usr/bin/perl -w

will turn warnings B<On> for the scope of the script. You can turn them B<Off> and B<On> in the script by setting the C<$^W> variable as noted above.

=item * Locally to a block

This code turns warnings mode B<On> for the scope of the block:

  {
    local $^W = 1;
    # some code
  }
  # $^W assumes its previous value here

This turns it B<Off>:

  {
    local $^W = 0;
    # some code
  }
  # $^W assumes its previous value here

Note that if you forget the C<local> operator, this code will affect the child processing the current request and all the subsequent requests processed by that child. Thus:

  $^W = 0;

will turn the warnings I<Off>, no matter what.

If you want to turn warnings I<On> for the scope of the whole file, as in the previous item, you can do this by adding:

  local $^W = 1;

at the beginning of the file. Since a file is effectively a block, file scope behaves like a block's curly braces C<{ }>, and C<local $^W> at the start of the file will be effective for the whole file.

=back

While having warning mode turned B<On> is essential for a development server, you should turn it globally B<Off> on a production server. If, for example, every served request generates only one warning, and your server serves millions of requests per day, your log file will eat up all your disk space and your system will die.

=head2 Taint Mode

Perl's C<-T> switch enables I<Taint> mode. (META: Link to security chapter). If you aren't forcing all your scripts to run under B<Taint> mode you are looking for trouble from malicious users. (See the I<perlsec> manpage for more information.)

If you have some scripts that won't run under Taint mode, run only the taint-clean ones under a mod_perl server with Taint mode enabled, and the rest on another server with Taint mode disabled--this can be either a mod_cgi in the front-end server or another back-end mod_perl server. You can use the mod_rewrite module and redirect requests based on the file extensions. For example, you can use I<.tcgi> for the taint-clean scripts and I<.cgi> for the rest, as sketched below.

When you have this setup you can start working toward cleaning the rest of the scripts, to make them run under Taint mode. Just because you have a few dirty scripts doesn't mean that you should jeopardize your whole service.
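A hypothetical front-end rule for such a split might look like this (the back-end host name and port are invented, and C<mod_proxy> must be available for the C<[P]> flag to work):

  RewriteEngine On
  # .tcgi scripts are taint-clean and stay on this server, which runs
  # with taint checking enabled; plain .cgi scripts are proxied to a
  # back-end server that runs without taint checking
  RewriteRule ^/perl/(.+\.cgi)$ http://backend.example.com:8081/perl/$1 [P]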
Since the C<-T> switch doesn't have an equivalent perl variable, mod_perl provides the C<PerlTaintCheck> directive to turn on taint checks. In C<httpd.conf>, enable this mode with:

  PerlTaintCheck On

Now any code compiled inside httpd will be taint-checked.

If you use the C<-T> switch, Perl will warn you that you should use the C<PerlTaintCheck> configuration directive and will otherwise ignore it.

=head2 Other switches

Finally, if you still need to set additional perl startup flags, such as C<-d> and C<-D>, you can use the C<PERL5OPT> environment variable. Switches in this variable are treated as if they were on every Perl command line. Only the C<-[DIMUdmw]> switches are allowed.

When the C<PerlTaintCheck> variable is turned on, the value of C<PERL5OPT> will be ignored. [META: verify]

See also L<Apache::PerlRun|modules/Apache__PerlRun___Run_unaltered_CGI_scripts_under_mod_perl>.

=head1 The strict pragma

It's _absolutely_ mandatory (at least for development) to start all your scripts with:

  use strict;

If needed, you can always turn off the 'strict' pragma, or a part of it, inside a block, e.g.:

  {
    no strict 'refs';
    ... some code
  }

It's more important to have the C<strict> pragma enabled under mod_perl than anywhere else. While it's not required by the language, its use cannot be too strongly recommended. It will save you a great deal of time. And, of course, clean scripts will still run under mod_cgi (plain CGI)!

=head1 Passing ENV variables to CGI

To pass an environment variable from I<httpd.conf>, add to it:

  PerlSetEnv key val
  PerlPassEnv key

e.g.:

  PerlSetEnv PERLDB_OPTS "NonStop=1 LineInfo=/tmp/db.out AutoTrace=1"

will set C<$ENV{PERLDB_OPTS}>, and it will be accessible in every child.

C<%ENV> is only set up for CGI emulation. If you are using the API, you should use C<$r-E<gt>subprocess_env>, C<$r-E<gt>notes> or C<$r-E<gt>pnotes> for passing data around between handlers. C<%ENV> is slow because it must update the underlying C environment table. It is also insecure, since its use exposes the data on systems which allow users to see the environment with C<ps>.

In any case, C<%ENV> and the tables used by those methods are all cleared after the request is served.

The Perl C<%ENV> is cleared during startup, but the C environment is left intact. With a combination of forking C<`env`> and C<E<lt>PerlE<gt>> sections, you can even do wildcard matching. For example, this passes all environment variables that begin with the letter H:

  <Perl>
    local $ENV{PATH} = '/usr/bin';
    local $_;

    for (`env`) {
        next unless /^(H.*)=/;
        push @PassEnv, $1;
    }
  </Perl>

See also L<PerlSetupEnv|performance/PerlSetupEnv_Off>, which can enable/disable environment variable settings.

=head1 -M and other time() file tests under mod_perl

Under mod_perl, files that have been created after the server's (child) startup are reported as having a negative age with the C<-M> (C<-C>, C<-A>) tests. This is obvious if you remember that you will get a negative result if the server was started before the file was created. It's normal behavior with perl.

If you want to have C<-M> report the time relative to the current request, you should reset the C<$^T> variable, just as with any other perl script. Add:

  local $^T = time;

at the beginning of the script.
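To see both behaviors side by side, here is a minimal sketch (the file name is made up) that can be run as a registry script:

  use IO::File ();

  my $file = "/tmp/created-after-startup"; # made-up name
  IO::File->new(">$file") or die $!;       # create/touch the file now

  warn "age relative to process start: ", -M $file, " days\n"; # negative
  {
      local $^T = time;
      warn "age relative to this request: ", -M $file, " days\n"; # ~0
  }

The warn() output goes to the I<error_log> file.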
Another even simpler solution would be to specify a fixup handler, which will be executed before each script is run:

  sub Apache::PerlBaseTime::handler {
    $^T = shift->request_time;
    return Apache::Constants::DECLINED;
  }

and then in the I<httpd.conf>:

  PerlFixupHandler Apache::PerlBaseTime

This technique is better performance-wise, as it skips the time() system call and instead uses the time at which the request started, which is already available via the C<$r-E<gt>request_time> method.

=head1 Apache and syslog

When native syslog support is enabled, the stderr stream will be redirected to C</dev/null>! It has nothing to do with mod_perl (plain Apache does the same). Doug wrote the L<Apache::LogSTDERR|porting/CGI_to_mod_perl_Porting__mod_perl_Coding_guidelines__> module to work around this.

=head1 File tests operators

Remember that with mod_perl you might get negative times when you use file test operators like C<-M> (last modification time), C<-A> (last access time), C<-C> (last inode-change time), and others. C<-M> returns the difference in time between the modification time of the file and the time the script was started. Because the C<$^T> variable is not reset on each script invocation, and is equal to the time when the process was forked, you might want to perform:

  $^T = time;

at the beginning of your scripts to simulate the regular perl script behaviour of file tests.

META: the above is a near-duplicate of "-M and other time() file tests under mod_perl"--make a link instead

=head1 Filehandlers and locks leakages

META: duplication at debug.pod: "Safe Resource Locking"

When you write a script running under mod_cgi, you can get away with sloppy programming, like opening a file and letting the interpreter close it for you when the script has finished its run:

  open IN, "in.txt" or die "Cannot open in.txt for reading: $!\n";

For mod_perl, before the end of the script you B<must> C<close()> any files you opened!

  close IN;

If you forget to C<close()>, you might get file descriptor leakage and (if you C<flock()ed> on this file descriptor) also unlock problems. Even if you do call close(), if for some reason the interpreter was stopped before the C<close()> call, the leakage will still happen. See for example L<Handling the 'User pressed Stop button' case|debug/Handling_the_User_pressed_Stop_>. After a long run without restarting Apache, your machine might run out of file descriptors, and worse, files might be left locked and unusable.

What can you do? Use C<IO::File> (and the other C<IO::*> modules). This allows you to assign the filehandle to a variable which can be C<my()> (lexically) scoped. When this variable goes out of scope the file or other file system entity will be properly closed (and unlocked if it was locked). Lexically scoped variables will always go out of scope at the end of the script's invocation, even if it was aborted in the middle. If the variable was defined inside some internal block, it will go out of scope at the end of the block. For example:

  {
    my $fh = IO::File->new("filename") or die $!;
    # read from $fh
  }
  # ...$fh is closed automatically at end of block, without leaks.

As I have just mentioned, you don't have to create a special block for this purpose. A script in a file is effectively written in a block with the same scope as the file, so you can simply write:

  my $fh = IO::File->new("filename") or die $!;
  # read from $fh
  # ...$fh is closed automatically at end of script, without leaks.
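The same holds for locks: since a lock is released when its filehandle is closed, a lexically scoped filehandle guarantees that an aborted script doesn't leave the file locked. A minimal sketch, with a made-up file name:

  use Fcntl qw(:flock);
  use IO::File ();

  {
      my $fh = IO::File->new("+<counter.txt") or die $!;
      flock $fh, LOCK_EX or die "cannot lock: $!";
      # ... read, update and write the shared file ...
  }
  # $fh went out of scope here: the file was closed and the lock
  # released, even if the code inside the block was aborted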
Using a C<{ BLOCK }> makes sure that the file is closed the moment the end of the block is reached.

An even faster and lighter technique is to use C<Symbol.pm>:

  my $fh = Symbol::gensym();
  open $fh, "filename" or die $!;

Use these approaches to ensure you have no leakages, but don't be too lazy to write C<close()> statements. Make it a habit.

Under perl 5.6.0 we can do this instead:

  open my $fh, $filename or die $!;

=head1 Code has been changed, but it seems the script is running the old code

Files pulled in via B<use> or B<require> statements are not automatically reloaded when they change on disk. See L<Reloading Modules and Required Files|porting/Reloading_Modules_and_Required_F> for more info.

=head1 The Script Is Too Dirty, But It Does The Job And I Cannot Afford To Rewrite It.

You can still gain from using mod_perl. One approach is to replace the C<Apache::Registry> handler with C<Apache::PerlRun> and define a new location. The script can reside in the same directory on the disk.

  # httpd.conf
  Alias /cgi-perl/ /home/httpd/cgi/
  <Location /cgi-perl>
    #AllowOverride None
    SetHandler perl-script
    PerlHandler Apache::PerlRun
    Options ExecCGI
    allow from all
    PerlSendHeader On
  </Location>

See L<Apache::PerlRun--a closer look|porting/Apache__PerlRun__a_closer_look>.

Another "bad", but workable method is to set C<MaxRequestsPerChild> to 1, which will force each child to exit after serving only one request. You will get the preloaded modules, etc., but the script will be compiled for each request, then be thrown away. This isn't good for "high-traffic" sites, as the parent server will need to fork a new child each time one is killed. You can fiddle with C<StartServers> and C<MinSpareServers>, so that the parent pre-spawns more servers than actually required and the killed one will immediately be replaced with a fresh one. Probably that's not what you want.

=head1 Apache::PerlRun--a closer look

C<Apache::PerlRun> gives you the benefit of preloaded Perl and its modules. This module's handler emulates the CGI environment, allowing programmers to write scripts that run under CGI or mod_perl without any change. Unlike C<Apache::Registry>, the C<Apache::PerlRun> handler does not cache the script inside a subroutine. Scripts will be "compiled" on each request. After the script has run, its name space is flushed of all variables and subroutines. Still, you don't have the overhead of loading the Perl interpreter and the compilation time of the standard modules. If your script is very light, but uses lots of standard modules, you will see no difference between C<Apache::PerlRun> and C<Apache::Registry>!

Be aware though, that if you use packages that use internal variables that have circular references, they will not be flushed! C<Apache::PerlRun> only flushes your script's name space, which does not include any other required packages' name spaces. If there's a reference to a C<my()> scoped variable that's keeping it from being destroyed after leaving the eval scope (of C<Apache::PerlRun>), that cleanup might not be taken care of until the server is shutdown and C<perl_destruct()> is run, which always happens after running command line scripts.
Consider this example:

  package Foo;
  sub new { bless {} }
  sub DESTROY {
    warn "Foo->DESTROY\n";
  }

  eval <<'EOF';
  package my_script;
  my $self = Foo->new;
  #$self->{circle} = $self;
  EOF

  print $@ if $@;
  print "Done with script\n";

When executed as a plain script you'll see:

  Foo->DESTROY
  Done with script

Then, uncomment the line where C<$self> makes a circular reference, and you'll see:

  Done with script
  Foo->DESTROY

If you run this example with the circular reference enabled under mod_perl you won't see C<Foo-E<gt>DESTROY> until server shutdown, or until your module properly takes care of things. Note that the warn() call logs its messages to the I<error_log> file, so you should expect the output there and not together with STDOUT.

=head1 Sharing variables between processes

META: to be completed

=over 4

=item *

Global variables initialized at server startup, through the Perl startup file, can be shared between processes, until modified by one of the processes. e.g. when you write:

  $My::debug = 1;

all processes will read the same value. If one of the processes changes that value to C<0>, it will still be equal to C<1> for any other process, but not for the one which actually made the change. When a process modifies a shared variable, it becomes the process's private copy.

=item *

C<IPC::Shareable> can be used to share variables between children.

=item *

libmm

=item *

other methods?

=back

=head1 Transitioning from Apache::Registry to Apache handlers

Even if you are a CGI script die-hard, at some point you might want to move a few or all your scripts to Apache Perl handlers. Actually this is an easy task, since we have already seen how C<Apache::Registry> makes our scripts appear to Apache to be Perl handlers.

When you no longer need backward mod_cgi compatibility you can benefit from the Perl libraries working only under mod_perl. We will see why in a moment.

Let's see an example. We will start with a mod_cgi compatible CGI script running under C<Apache::Registry>, transpose it into a Perl content handler and then convert it to use C<Apache::Request> and C<Apache::Cookie>.

=head2 Starting with mod_cgi Compatible Script

This is the original script's code we are going to work with:

  cookie_script.pl
  ----------------
  use strict;
  use CGI;
  use CGI::Cookie;
  use vars qw($q $switch $status $sessionID);

  init();
  print_header();
  print_status();

  ### <-- subroutines --> ###

  # the init code
  ###########
  sub init{

    $q = new CGI;
    $switch = $q->param("switch") ? 1 : 0;

    # try to retrieve the session ID
    # fetch existing cookies
    my %cookies = CGI::Cookie->fetch;
    $sessionID = exists $cookies{'sessionID'}
      ? $cookies{'sessionID'}->value : '';

    # 0 = not running, 1 = running
    $status = $sessionID ? 1 : 0;
    # switch status if asked to
    $status = ($status+1) % 2 if $switch;

    if ($status){
      # preserve sessionID if exists or create a new one
      $sessionID ||= generate_sessionID() if $status;
    } else {
      # delete the sessionID
      $sessionID = '';
    }

  } # end of sub init

  #################
  sub print_header{
    # prepare a cookie
    my $c = CGI::Cookie->new
      ('-name'    => 'sessionID',
       '-value'   => $sessionID,
       '-expires' => '+1h');

    print $q->header
      (-type   => 'text/html',
       -cookie => $c);

  } # end of sub print_header

  # print the current Session status and a form to toggle the status
  #################
  sub print_status{

    print qq{<HTML><HEAD><TITLE>Cookie</TITLE></HEAD><BODY>};

    # print status
    print "<B>Status:</B> ",
      $status
        ? "Session is running with ID: $sessionID"
        : "No session is running";

    # change status form
    my $button_label = $status ?
"Stop" : "Start"; print qq{<HR> <FORM> <INPUT TYPE=SUBMIT NAME=switch VALUE=" $button_label "> </FORM> }; print qq{</BODY></HTML>}; } # end of sub print_status # A dummy ID generator # Replace with a real session ID generator ######################## sub generate_sessionID { return scalar localtime; } # end of sub generate_sessionID The code is very simple. It creates a session if you've pressed the I<'Start'> button or deletes it if you've pressed the I<'Stop'> button. The session is stored and retrieved using the cookies technique. Note that we have split the obviously simple and short code into three logical units, by putting the code into three subroutines. init() to initialize global variables and parse incoming data, print_header() to print the HTTP headers including the cookie header, and finally print_status() to generate the output. Later we will see that this logical separation will allow us an easy conversion to Perl content handler code. We have used global variables for a few variables since we didn't want to pass them from function to function. In a big project you should be very restrictive about what variables should be allowed to be global, if any at all. In any case, the init() subroutine makes sure all these variables are re-initialized for each code reinvocation. Note that we have used a very simple generate_sessionID() function that returns a date string (i.e. S<Wed Apr 12 15:02:23 2000>) as a session ID. You want to replace this one with code which generates a unique session every time it was called. And it should be secure, i.e. users will not be able to forge one and do nasty things. =head2 Converting into Perl Content Handler Now let's convert this script into a content handler. There are two parts to this task; the first one is to configure Apache to run the new code as a Perl handler, the second one is to modify the code itself. First we add the following snippet to I<httpd.conf>: PerlModule Test::Cookie <Location /test/cookie> SetHandler perl-script PerlHandler Test::Cookie </Location> After we restart the server, when there is a request whose URI starts with I</test/cookie>, Apache will execute the C<Test::Cookie::handler()> subroutine as a content handler. We made sure to preload the C<Test::Cookie> module at server start-up, with the C<PerlModule> directive. Now we are going to modify the script itself. We copy the content to the file I<Cookie.pm> and place it into one of the directories listed in C<@INC>. For example if I</home/httpd/perl> is a part of C<@INC> and since we want to call this package C<Test::Cookie>, we can put I<Cookie.pm> into the I</home/httpd/perl/Test/> directory. So this is the new code. Notice that all the subroutines were left unmodified from the original script, so to make the differences clear we do not repeat them here. Test/Cookie.pm -------------- package Test::Cookie; use Apache::Constants qw(:common); use strict; use CGI; use CGI::Cookie; use vars qw($q $switch $status $sessionID); sub handler{ my $r = shift; Apache->request($r); init(); print_header(); print_status(); return OK; } ### <-- subroutines --> ### # all subroutines as before 1; As you see there are two lines added to the beginning of the code: package Test::Cookie; use Apache::Constants qw(:common); The first one declares the package name and the second one imports some symbols commonly used in Perl handlers to return status codes. use strict; use CGI; use CGI::Cookie; use vars qw($q $switch $status $sessionID); This code is left unchanged just as before. 
  sub handler{
    my $r = shift;
    Apache->request($r);

    init();
    print_header();
    print_status();

    return OK;
  }

Each content handler (and any other handler) should begin with a subroutine called handler(). This subroutine is called when a request's URI starts with I</test/cookie>, as per our configuration. Of course you can choose a different name, for example execute(), but then you must explicitly use it in the configuration directives in the following way:

  PerlModule Test::Cookie
  <Location /test/cookie>
    SetHandler perl-script
    PerlHandler Test::Cookie::execute
  </Location>

But we will use the default name, handler().

The handler() subroutine is just like any other subroutine, but generally it has the following structure:

  sub handler{
    my $r = shift;

    # the code

    # status (OK, DECLINED or else)
    return OK;
  }

First we get the request object by shifting it from C<@_> and assigning it to the C<$r> variable. Second we write the code that does the processing of the request. Third we return the status of the execution. There are many possible statuses; the most commonly used are C<OK> and C<DECLINED>, which tell the server whether the handler has completed the request phase it was assigned to or not. If not, another handler must complete the processing. C<Apache::Constants> imports these two and some other commonly used status codes.

So in our example all we had to do was to wrap the three calls:

  init();
  print_header();
  print_status();

inside:

  sub handler{
    my $r = shift;
    Apache->request($r);

    return OK;
  }

There is one line we didn't discuss:

  Apache->request($r);

Since we use C<CGI.pm>, it relies on the fact that C<$r> was set in the C<Apache> module. C<Apache::Registry> did that behind the scenes. Since we don't use C<Apache::Registry> here, we have to do it ourselves.

The last thing we should do is to add C<1;> at the end of the module, just like with any Perl module, so C<PerlModule> will not fail when it tries to load C<Test::Cookie>.

So to summarize, we took the original script's code and added the following eight lines:

  package Test::Cookie;
  use Apache::Constants qw(:common);

  sub handler{
    my $r = shift;
    Apache->request($r);

    return OK;
  }

  1;

and now we have a fully fledged Perl Content Handler.

=head2 Converting to use Apache Perl Modules

So now we have a complete PerlHandler. Let's convert it to use Apache Perl modules. This breaks backward compatibility, but gives us better performance, mainly because the internals of many of these Perl modules are implemented in C, so we should get a significant improvement in speed. The section "L<TMTOWTDI: Convenience and Performance|performance/TMTOWTDI_Convenience_and_Habit_>" compares the three approaches.

What we are going to do is to replace C<CGI.pm> and C<CGI::Cookie> with C<Apache::Request> and C<Apache::Cookie> respectively. The two modules are written in C with the XS interface to Perl, which makes code that uses these modules heavily much faster. C<Apache::Request> uses an API similar to the one C<CGI> uses; the same goes for C<Apache::Cookie> and C<CGI::Cookie>. This allows an easy porting process.

Basically we just replace:

  use CGI;
  $q = new CGI;

with:

  use Apache::Request ();
  my $q = Apache::Request->new($r);

and

  use CGI::Cookie ();
  my $cookie = CGI::Cookie->new(...)
with:

  use Apache::Cookie ();
  my $cookie = Apache::Cookie->new($r, ...);

This is the new code for C<Test::Cookie2>:

  Test/Cookie2.pm
  ---------------
  package Test::Cookie2;
  use Apache::Constants qw(:common);

  use strict;
  use Apache::Request;
  use Apache::Cookie ();
  use vars qw($r $q $switch $status $sessionID);

  sub handler{
    $r = shift;

    init();
    print_header();
    print_status();

    return OK;
  }

  ### <-- subroutines --> ###

  # the init code
  ###########
  sub init{

    $q = Apache::Request->new($r);
    $switch = $q->param("switch") ? 1 : 0;

    # fetch existing cookies
    my %cookies = Apache::Cookie->fetch;
    # try to retrieve the session ID
    $sessionID = exists $cookies{'sessionID'}
      ? $cookies{'sessionID'}->value : '';

    # 0 = not running, 1 = running
    $status = $sessionID ? 1 : 0;
    # switch status if asked to
    $status = ($status+1) % 2 if $switch;

    if ($status){
      # preserve sessionID if exists or create a new one
      $sessionID ||= generate_sessionID() if $status;
    } else {
      # delete the sessionID
      $sessionID = '';
    }

  } # end of sub init

  #################
  sub print_header{
    # prepare a cookie
    my $c = Apache::Cookie->new
      ($r,
       -name    => 'sessionID',
       -value   => $sessionID,
       -expires => '+1h');

    # Add a Set-Cookie header to the outgoing headers table
    $c->bake;

    $r->send_http_header('text/html');

  } # end of sub print_header

  # print the current Session status and a form to toggle the status
  #################
  sub print_status{

    print qq{<HTML><HEAD><TITLE>Cookie</TITLE></HEAD><BODY>};

    # print status
    print "<B>Status:</B> ",
      $status
        ? "Session is running with ID: $sessionID"
        : "No session is running";

    # change status form
    my $button_label = $status ? "Stop" : "Start";
    print qq{<HR>
       <FORM>
         <INPUT TYPE=SUBMIT NAME=switch VALUE=" $button_label ">
       </FORM>
      };

    print qq{</BODY></HTML>};

  } # end of sub print_status

  # replace with a real session ID generator
  ########################
  sub generate_sessionID {
    return scalar localtime;
  }

  1;

The only other changes are in the print_header() function. Instead of passing the cookie code to C<CGI>'s header() to return a proper HTTP header, as in:

  print $q->header
    (-type   => 'text/html',
     -cookie => $c);

we do it in two stages:

  $c->bake;

adds a C<Set-Cookie> header to the outgoing headers table, and:

  $r->send_http_header('text/html');

sends out the header itself. We have also eliminated:

  Apache->request($r);

since we don't rely on C<CGI.pm> any more and in this case we don't need it.

The rest of the code is unchanged. Of course we add the following snippet to I<httpd.conf>:

  PerlModule Test::Cookie2
  <Location /test/cookie2>
    SetHandler perl-script
    PerlHandler Test::Cookie2
  </Location>

So now the magic URI that will trigger the above code execution will be the one starting with I</test/cookie2>. We save the code in the file I</home/httpd/perl/Test/Cookie2.pm> since we have called this package C<Test::Cookie2>.

=head2 Conclusion

If you took care to write the original plain CGI script's code in a clean and modular way, you can see that the transition is a very simple one and doesn't take a lot of effort. Almost no code was modified.

=cut



1.1                  modperl-docs/src/docs/1.0/guide/scenario.pod

Index: scenario.pod
===================================================================

=head1 NAME

Real World Scenarios

=head1 Assumptions

I will assume for this section that you are familiar with plain (not mod_perl enabled) Apache, its compilation and configuration.

In all configuration and code examples I will use I<localhost> or I<www.example.com> as a hostname. For the testing on a local machine, I<localhost> would be just fine.
If you are using the real name of your machine make sure to replace I<www.example.com> with the name of your machine.

=head1 Standalone mod_perl Enabled Apache Server

=head2 Installation in 10 lines

The installation is very simple. This example shows installation on the Linux operating system.

  % cd /usr/src
  % lwp-download http://www.apache.org/dist/apache_x.x.x.tar.gz
  % lwp-download http://perl.apache.org/dist/mod_perl-x.xx.tar.gz
  % tar xzvf apache_x.x.x.tar.gz
  % tar xzvf mod_perl-x.xx.tar.gz
  % cd mod_perl-x.xx
  % perl Makefile.PL APACHE_SRC=../apache_x.x.x/src \
    DO_HTTPD=1 USE_APACI=1 EVERYTHING=1
  % make && make test && make install
  % cd ../apache_x.x.x
  % make install

That's all!

Notes: Replace x.xx and x.x.x with the real version numbers of mod_perl and Apache respectively. The C<z> flag tells GNU C<tar> to uncompress the archive as well as extract the files. You might need superuser permissions to do the make install steps.

=head2 Installation in 10 paragraphs

If you have the C<lwp-download> utility installed, you can use it to download the sources of both packages:

  % lwp-download http://www.apache.org/dist/apache_x.x.x.tar.gz
  % lwp-download http://perl.apache.org/dist/mod_perl-x.xx.tar.gz

C<lwp-download> is a part of the LWP module (from the C<libwww> package), and you will need to have it installed in order for mod_perl's C<make test> step to pass.

Extract both sources. Usually I extract all the sources in I</usr/src/>, but your mileage may vary. So move the sources to the directory that you want to put them in and C<chdir> to it. If you have a non-GNU C<tar> utility it will be unable to decompress, so you will have to unpack in two steps: first uncompress the packages with:

  gzip -d apache_x.x.x.tar.gz
  gzip -d mod_perl-x.xx.tar.gz

then un-tar them with:

  tar xvf apache_x.x.x.tar
  tar xvf mod_perl-x.xx.tar

You can probably use gunzip instead of gzip -d if you prefer.

  % cd /usr/src
  % tar xzvf apache_x.x.x.tar.gz
  % tar xzvf mod_perl-x.xx.tar.gz

C<chdir> to the mod_perl source directory:

  % cd mod_perl-x.xx

Now build the Makefile. For your first installation and most basic work the parameters in the example below are the only ones you will need. C<APACHE_SRC> tells the Makefile.PL where to find the Apache I<src> directory. If you have followed my suggestion and have extracted both sources under the directory I</usr/src>, then issue the command:

  % perl Makefile.PL APACHE_SRC=../apache_x.x.x/src \
    DO_HTTPD=1 USE_APACI=1 EVERYTHING=1

There are many additional optional parameters. You can find some of them later in this section and in the L<Server Configuration|config/mod_perl_Configuration_> section.

While running C<perl Makefile.PL ...> the process will check for prerequisites and tell you if something is missing. If you are missing some of the perl packages or other software, you will have to install them before you proceed.

Next C<make> the project. The command C<make> builds the mod_perl extension and also calls C<make> in the Apache source directory to build C<httpd>. Then we run the I<test> suite, and finally I<install> the mod_perl modules in their proper places.

  % make && make test && make install

Note that if C<make> fails, neither C<make test> nor C<make install> will be executed. If C<make test> fails, C<make install> will not be executed.

Now change to the Apache source directory and run C<make install>. This will install Apache's headers and default configuration files, build the Apache directory tree and put C<httpd> in it.
  % cd ../apache_x.x.x
  % make install

When you execute the above command, the Apache installation process will tell you how to start a freshly built webserver (you need to know the path of C<apachectl>, more about that later) and where to find the configuration files. Write down both, since you will need this information very soon. On my machine the two important paths are:

  /usr/local/apache/bin/apachectl
  /usr/local/apache/conf/httpd.conf

Now the build and installation processes are complete.

=head2 Configuration

First, a simple configuration. Configure Apache as you usually would (set C<Port>, C<User>, C<Group>, C<ErrorLog>, other file paths etc).

Start the server and make sure it works, then shut it down. The C<apachectl> utility can be used to start and stop the server:

  % /usr/local/apache/bin/apachectl start
  % /usr/local/apache/bin/apachectl stop

Now we will configure Apache to run perl CGI scripts under the C<Apache::Registry> handler.

You can put configuration directives in a separate file and tell I<httpd.conf> to include it, but for now we will simply add them to the main configuration file. We will add the mod_perl configuration directives to the end of I<httpd.conf>. In fact you can place them anywhere in the file, but they are easier to find at the end.

For the moment we will assume that you will put all the scripts which you want to be executed by the mod_perl enabled server under the directory I</home/httpd/perl>. We will alias this directory to the URI I</perl>.

Add the following configuration directives to I<httpd.conf>:

  Alias /perl/ /home/httpd/perl/

  PerlModule Apache::Registry
  <Location /perl>
    SetHandler perl-script
    PerlHandler Apache::Registry
    Options ExecCGI
    PerlSendHeader On
    allow from all
  </Location>

Now create a four-line test script in I</home/httpd/perl/>:

  test.pl
  -------
  #!/usr/bin/perl -w
  use strict;
  print "Content-type: text/html\r\n\r\n";
  print "It worked!!!\n";

Note that the server is probably running as a user with a restricted set of privileges, perhaps as user C<nobody> or C<www>. Look for the C<User> directive in I<httpd.conf> to find the userid of the server.

Make sure that you have read and execute permissions for I<test.pl>.

  % chmod u+rx /home/httpd/perl/test.pl

Test that the script works from the command line, by executing it:

  % /home/httpd/perl/test.pl

You should see:

  Content-type: text/html

  It worked!!!

Assuming that the server's userid is C<nobody>, make the script owned by this user. We already made it readable and executable by its user.

  % chown nobody /home/httpd/perl/test.pl

Now it is time to test that mod_perl enabled Apache can execute the script.

Start the server ('C<apachectl start>'). Check in I<logs/error_log> to see that the server has indeed started--verify the correct date and time of the log entry.

To get Apache to execute the script we simply fetch its URI. Assuming that your I<httpd.conf> has been configured with the directive C<Port 80>, start your favorite browser and fetch the following URI:

  http://www.example.com/perl/test.pl

If you have the loop-back device (127.0.0.1) configured, you can use the URI:

  http://localhost/perl/test.pl

In either case, you should see:

  It worked!!!

If your server is listening on a port other than 80, for example 8000, then fetch the URI:

  http://www.example.com:8000/perl/test.pl

or whatever is appropriate.

If something went wrong, go through the installation process again, and make sure you didn't make a mistake.
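While troubleshooting, it helps to watch the server's error log in another terminal as you repeat the failing request, e.g. (assuming the default installation path used above):

  % tail -f /usr/local/apache/logs/error_log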
If that doesn't help, read the C<INSTALL> pod document (C<perldoc INSTALL>) in the mod_perl distribution directory.

Now that your mod_perl server is working, copy some of your Perl CGI scripts into the directory I</home/httpd/perl/> or below it.

If your programming techniques are good, chances are that your scripts will work with no modifications at all. With the mod_perl enabled server you will see them working very much faster.

If your programming techniques are sloppy, some of your scripts will not work and they may exhibit strange behaviour. Depending on the degree of sloppiness they may need anything from minor tweaking to a major rewrite to make them work properly. (See L<Sometimes My Script Works, Sometimes It Does Not|debug/Sometimes_My_Script_Works_Somet>)

The above setup is very basic, but as with Perl, you can start to benefit from mod_perl from the very first moment you try it. As you become more familiar with mod_perl you will want to start writing Apache handlers and make more use of its power.

=head1 One Plain and One mod_perl enabled Apache Server

Since we are going to run two Apache servers we will need two complete (and different) sets of configuration, log and other files. In this scenario we'll use a dedicated root directory for each server, which is a personal choice. You can choose to have both servers living under the same roof, but it might lead to a mess, since it requires a slightly more complicated configuration. This decision might be nice since you will be able to share some directories like I<include> (which contains Apache headers), but in fact this can become a problem later, when you decide to upgrade one server but not the other. You will have to solve this problem then, so why not avoid it in the first place?

From now on we will refer to these two servers as B<httpd_docs> (plain Apache) and B<httpd_perl> (Apache/mod_perl). We will use I</usr/local> as our I<root> directory.

First let's prepare the sources. We will assume that all the sources go into the I</usr/src> directory. Since you will probably want to tune each copy of Apache separately, it is better to use two separate copies of the Apache source for this configuration. For example you might want only the httpd_docs server to be built with the mod_rewrite module.

Having two independent source trees will prove helpful unless you use dynamically shared objects (C<DSO>), which is covered later in this chapter.

Make two subdirectories:

  % mkdir /usr/src/httpd_docs
  % mkdir /usr/src/httpd_perl

Next put a set of the Apache sources into the I</usr/src/httpd_docs> directory (replace the directory I</tmp> with the path to the downloaded file and C<x.x.x> with the version of Apache that you have downloaded):

  % cd /usr/src/httpd_docs
  % gzip -dc /tmp/apache_x.x.x.tar.gz | tar xvf -

or if you have GNU tar:

  % tar xvzf /tmp/apache_x.x.x.tar.gz

Just to check we have extracted correctly:

  % ls -l
  drwxr-xr-x  8 stas  stas 2048 Apr 29 17:38 apache_x.x.x/

Now prepare the httpd_perl server sources:

  % cd /usr/src/httpd_perl
  % gzip -dc /tmp/apache_x.x.x.tar.gz | tar xvf -
  % gzip -dc /tmp/modperl-x.xx.tar.gz | tar xvf -

  % ls -l
  drwxr-xr-x  8 stas  stas 2048 Apr 29 17:38 apache_x.x.x/
  drwxr-xr-x  8 stas  stas 2048 Apr 29 17:38 modperl-x.xx/

We are going to use a default Apache directory layout, and place each server's directories under its dedicated directory.
The two directories are, as you have already guessed:

  /usr/local/httpd_perl/
  /usr/local/httpd_docs/

The next step is to configure and compile the sources. Below are the procedures to compile both servers, using the directory layout I have just suggested.

=head2 Configuration and Compilation of the Sources

As usual we use I<x.x.x> instead of real version numbers so this document will never become obsolete. But the most important thing -- it's not misleading. It's quite possible that since the moment this document was written a new version has come out, and you will not be aware of this fact unless you check for it.

=head3 Building the httpd_docs Server

=over 8

=item Sources Configuration:

  % cd /usr/src/httpd_docs/apache_x.x.x
  % make clean
  % ./configure --prefix=/usr/local/httpd_docs \
    --enable-module=rewrite --enable-module=proxy

We need the I<mod_rewrite> and I<mod_proxy> modules as we will see later, so we tell I<./configure> to build them in.

You might want to add C<--layout> to see the resulting directories' layout without actually running the configuration process.

=item Source Compilation and Installation

  % make
  % make install

Rename C<httpd> to C<httpd_docs>:

  % mv /usr/local/httpd_docs/bin/httpd \
    /usr/local/httpd_docs/bin/httpd_docs

Now modify the B<apachectl> utility to point to the renamed httpd via your favorite text editor or by using perl:

  % perl -pi -e 's|bin/httpd|bin/httpd_docs|' \
    /usr/local/httpd_docs/bin/apachectl

=back

Another approach would be to use the I<--target> option while configuring the source, which makes the last two commands unnecessary.

  % ./configure --prefix=/usr/local/httpd_docs \
    --target=httpd_docs \
    --enable-module=rewrite --enable-module=proxy
  % make && make install

Since we told I<./configure> that we want the executable to be called C<httpd_docs> (via C<--target=httpd_docs>), it performs all the naming adjustments for us. The only thing that you might find unusual is that I<apachectl> will now be called I<httpd_docsctl> and the configuration file I<httpd.conf> will now be called I<httpd_docs.conf>.

We will leave the decision about the preferred configuration and installation method to the reader. In the rest of the guide we will continue using the regular names resulting from the standard configuration and the manual executable name adjustment described at the beginning of this section.

=head3 Building the httpd_perl Server

Now we proceed with the source configuration and installation of the I<httpd_perl> server. First make sure the sources are clean:

  % cd /usr/src/httpd_perl/apache_x.x.x
  % make clean
  % cd /usr/src/httpd_perl/mod_perl-x.xx
  % make clean

It is important to B<make clean> since some of the versions are not binary compatible (e.g. apache 1.3.3 vs 1.3.4), so any "third-party" C modules need to be re-compiled against the latest header files.

  % cd /usr/src/httpd_perl/mod_perl-x.xx

  % /usr/bin/perl Makefile.PL \
    APACHE_SRC=../apache_x.x.x/src \
    DO_HTTPD=1 USE_APACI=1 EVERYTHING=1 \
    APACHE_PREFIX=/usr/local/httpd_perl \
    APACI_ARGS='--prefix=/usr/local/httpd_perl'

If you need to pass any other configuration options to Apache's C<configure>, add them after the I<--prefix> option. e.g.:

  APACI_ARGS='--prefix=/usr/local/httpd_perl \
    --enable-module=status'

Notice that B<all> C<APACI_ARGS> (above) must be passed as one long line if you work with C<t?csh>! However with C<(ba)?sh> it works correctly the way it is shown above, breaking the long lines with 'C<\>'.
As of tcsh version 6.08.0, when it passes the C<APACI_ARGS> arguments to C<configure> it does not alter the newlines, but it strips the backslashes, thus breaking the configuration process.

Notice that just like in the httpd_docs configuration you can use C<--target=httpd_perl> instead of specifying each directory separately. Note that this option has to be the very last argument in C<APACI_ARGS>, otherwise 'make test' tries to run C<'httpd_perl'>, which fails.

[META: It's very important to use the same compiler you build the perl with. See the section 'L<What Compiler Should Be Used to Build mod_perl|install/What_Compiler_Should_Be_Used_to_>' for more information.]

[META: --- Hmm, what's the option that overrides the compiler when building Apache from mod_perl. Check also whether mod_perl supplies the right compiler (the one used for building itself) -- if it does there is no need for the above note. ]

Now, build, test and install C<httpd_perl>.

  % make && make test && make install

Upon installation Apache puts a stripped version of C<httpd> at I</usr/local/httpd_perl/bin/httpd>. The original version, which includes debugging symbols (if you need to run a debugger on this executable), is located at I</usr/src/httpd_perl/apache_x.x.x/src/httpd>.

You may have noticed that we did not run C<make install> in the Apache source directory. When C<USE_APACI> is enabled, C<APACHE_PREFIX> will specify the C<--prefix> option for Apache's C<configure> utility, which gives the installation path for Apache. When this option is used, mod_perl's C<make install> will also C<make install> for Apache, installing the httpd binary, the support tools, and the configuration, log and document trees. If this option is not used you will have to explicitly run make install in the Apache source directory.

If C<make test> fails, look into I</usr/src/httpd_perl/mod_perl-x.xx/t/logs> and read the error_log file. Also see L<make test fails|install/make_test_fails>.

While doing C<perl Makefile.PL ...> mod_perl might complain by warning you about a missing library C<libgdbm>. This is a crucial warning. See L<Missing or Misconfigured libgdbm.so|install/Missing_or_Misconfigured_libgdbm> for more info.

Now rename C<httpd> to C<httpd_perl>:

  % mv /usr/local/httpd_perl/bin/httpd \
    /usr/local/httpd_perl/bin/httpd_perl

Update the I<apachectl> utility to drive the renamed httpd:

  % perl -p -i -e 's|bin/httpd|bin/httpd_perl|' \
    /usr/local/httpd_perl/bin/apachectl

=head2 Configuration of the servers

Now that we have completed the building process, the last stage before running the servers is to configure them.

=head3 Basic httpd_docs Server Configuration

Configuring the C<httpd_docs> server is a very easy task. Starting from version 1.3.4 of Apache, there is only one file to edit. Open I</usr/local/httpd_docs/conf/httpd.conf> in your favorite text editor and configure it as you usually would, except make sure that you configure the log file directory (I</usr/local/httpd_docs/logs> and so on) and the other paths according to the layout you have decided to use.

Start the server with:

  /usr/local/httpd_docs/bin/apachectl start

=head3 Basic httpd_perl Server Configuration

Edit the I</usr/local/httpd_perl/conf/httpd.conf>. As with the C<httpd_docs> server configuration, make sure that C<ErrorLog> and other file location directives are set to point to the right places, according to the chosen directory layout.
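For instance, with the layout suggested in this chapter the path related directives might look like this (just a sketch, not a complete configuration; adjust the paths if you have chosen a different layout):

  # extract from /usr/local/httpd_perl/conf/httpd.conf
  ServerRoot  "/usr/local/httpd_perl"
  PidFile     /usr/local/httpd_perl/logs/httpd.pid
  ErrorLog    /usr/local/httpd_perl/logs/error_log
  TransferLog /usr/local/httpd_perl/logs/access_log

With those paths in place, we can move on to the settings specific to this server.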
The first thing to do is to set a C<Port> directive -- it should be different from that used by the plain Apache server (C<Port 80>), since we cannot bind two servers to the same port number on the same machine. Here we will use C<8080>. Some developers use port C<81>, but you can bind to ports below 1024 only if the server has root permissions. If you are running on a multiuser machine, there is a chance that someone already uses that port, or will start using it in the future, which could cause problems. If you are the only user on your machine, basically you can pick any unused port number. Many organizations use firewalls which may block some of the ports, so port number choice can be a controversial topic. From my experience the most popular port numbers are: C<80>, C<81>, C<8000> and C<8080>. Personally, I prefer the port C<8080>. Of course with the two server scenario you can hide the nonstandard port number from firewalls and users, by using either mod_proxy's C<ProxyPass> directive or a proxy server like Squid. For more details see L<Publishing Port Numbers other than 80|config/Publishing_Port_Numbers_other_th>, L<Running One Webserver and Squid in httpd Accelerator Mode|scenario/Running_One_Webserver_and_Squid_>, L<Running Two Webservers and Squid in httpd Accelerator Mode|scenario/Running_Two_webservers_and_Squid> and L<Using mod_proxy|scenario/mod_proxy>.

Now we proceed to the mod_perl specific directives. It is a good idea to add them all at the end of C<httpd.conf>, since you are going to fiddle with them a lot in the early stages.

First, you need to specify the location where all mod_perl scripts will be located. Add the following configuration directive:

  # mod_perl scripts will be called from
  Alias /perl/ /usr/local/httpd_perl/perl/

From now on, all requests for URIs starting with I</perl> will be executed under mod_perl and will be mapped to the files in I</usr/local/httpd_perl/perl/>.

Now we configure the I</perl> location:

  PerlModule Apache::Registry

  <Location /perl>
    #AllowOverride None
    SetHandler perl-script
    PerlHandler Apache::Registry
    Options ExecCGI
    allow from all
    PerlSendHeader On
  </Location>

This configuration causes any script that is called with a path prefixed with I</perl> to be executed under the C<Apache::Registry> module and as a CGI (hence the C<ExecCGI>--if you omit this option the script will be printed to the user's browser as plain text, or will possibly trigger a 'B<Save-As>' window). The C<Apache::Registry> module lets you run your (carefully written) Perl CGI scripts virtually unchanged under mod_perl. The C<PerlModule> directive is the equivalent of Perl's require(); we load the C<Apache::Registry> module before we use it in the C<PerlHandler Apache::Registry> directive.

C<PerlSendHeader On> tells the server to send an HTTP header to the browser on every script invocation. You will want to turn this off for nph (non-parsed-headers) scripts.

This is only a very basic configuration. The L<Server Configuration|config/> section covers the rest of the details.

Now start the server with:

  /usr/local/httpd_perl/bin/apachectl start

=head1 Running Two webservers and Squid in httpd Accelerator Mode

While I have detailed the mod_perl server installation, you are on your own with installing the Squid server (see L<Getting Helped|help/> for more details). I run Linux, so I downloaded the RPM package, installed it, configured the I</etc/squid/squid.conf>, fired off the server and all was set.
Basically once you have Squid installed, you just need to modify the default C<squid.conf> as I will explain below, then you are ready to run it.

The configuration that I'm going to present works with Squid server version 2.3.STABLE2. It's possible that some directives will change in future versions.

First, let's take a look at what we have already running and what we want from squid.

Previously we had the C<httpd_docs> and C<httpd_perl> servers listening on ports 80 and 8080. Now we want squid to listen on port 80, to forward requests for static objects (plain HTML pages, images and so on) to the port which the httpd_docs server listens to, and to forward dynamic requests to httpd_perl's port, collecting the generated responses and delivering them to the client. As mentioned before, this mode is known as I<httpd-accelerator> mode in proxy dialect.

Therefore we have to reconfigure the httpd_docs server to listen to port 81 instead, since port 80 will be taken by Squid. Remember that in our scenario both copies of Apache will reside on the same machine as Squid.

A proxy server makes all the magic behind it transparent to users. Both Apache servers return the data to Squid (unless it was already cached by Squid). The client never sees the other ports and never knows that there might be more than one server running. Do not confuse this scenario with C<mod_rewrite>, where a server redirects the request somewhere according to the rewrite rules and forgets all about it (i.e. it works as a one way dispatcher, which dispatches jobs but is not responsible for them).

Squid can be used as a straightforward proxy server. ISPs and other companies generally use it to cut down the incoming traffic by caching the most popular requests. However we want to run it in C<httpd accelerator mode>. Two directives (C<httpd_accel_host> and C<httpd_accel_port>) enable this mode. We will see more details shortly. If you are currently using Squid in the regular proxy mode, you can extend its functionality by running both modes concurrently. To accomplish this, you can extend the existing Squid configuration with B<httpd accelerator mode>'s related directives, or you can just create one from scratch.

Let's go through the changes we should make to the default configuration file. Since the file with default settings (I</etc/squid/squid.conf>) is huge (about 60KB) and we will not alter 95% of its default settings, my suggestion is to write a new one including only the modified directives.

We want to enable the redirect feature, to be able to serve requests from more than one server (in our case we have two: the httpd_docs and httpd_perl servers). So we specify C<httpd_accel_host> as virtual. This assumes that your server has multiple interfaces -- Squid will bind to all of them.

  httpd_accel_host virtual

Then we define the default port the requests will be sent to, unless redirected. We assume that most requests will be for static documents (also, it's easy to define redirect rules for the mod_perl server because its URIs start with I</perl> or similar). We have our httpd_docs listening on port 81:

  httpd_accel_port 81

And as described before, squid listens to port 80:

  http_port 80

We do not use C<icp> (C<icp> is used for cache sharing between neighboring machines, which is more relevant in the proxy mode):

  icp_port 0

C<hierarchy_stoplist> defines a list of words which, if found in a URL, cause the object to be handled directly by the cache.
Since we told Squid in the previous directive that we aren't going to share the cache between neighboring machines, this directive is irrelevant. If you do use this feature, make sure to set this directive to something like:

  hierarchy_stoplist /cgi-bin /perl

where I</cgi-bin> and I</perl> are aliases for the locations which handle the dynamic requests.

Now we tell Squid not to cache dynamically generated pages:

  acl QUERY urlpath_regex /cgi-bin /perl
  no_cache deny QUERY

Please note that the last two directives are controversial ones. If you want your scripts to be more compliant with the HTTP standards, according to the HTTP specification the headers of your scripts should carry the I<Caching Directives>: C<Last-Modified> and C<Expires>.

What are they for? If you set the headers correctly, there is no need to tell the Squid accelerator B<NOT> to try to cache anything. Squid will not bother your mod_perl servers a second time if a request is (a) cacheable and (b) still in the cache. Many mod_perl applications will produce identical results on identical requests if not much time has elapsed between the requests. So your Squid might have a hit ratio of 50%, which means that the mod_perl servers will have only half as much work to do as they did before you installed Squid (or mod_proxy).

Even if you insert a user-ID and date in your page, caching can save resources when you set the expiration time to 1 second. A user might double click where a single click would do, thus sending two requests in parallel. Squid could serve the second request. But this is only possible if you set the headers correctly. Refer to the chapter L<Correct Headers - A quick guide for mod_perl users|correct_headers/> to learn more about generating the proper caching headers under mod_perl.

In the case where only the scripts under I</perl/caching-unfriendly> are not I<caching friendly>, fix the above setting to be:

  acl QUERY urlpath_regex /cgi-bin /perl/caching-unfriendly
  no_cache deny QUERY

But if you are lazy, or just have too many things to deal with, you can leave the above directives the way we described. Just keep in mind that one day you will want to reread this section and L<the headers generation tutorial|correct_headers/> to squeeze even more power from your servers without investing money in more memory and better hardware.

While testing you might want to enable the debugging options and watch the log files in the directory I</var/log/squid/>. But make sure to turn debugging off in your production server. Below we show it commented out, which keeps it disabled (it is disabled by default anyway). Debug option 28 enables the debugging of the access control routes; for other debug codes see the documentation embedded in the default configuration file that comes with squid.

  # debug_options 28

We need to provide a way for Squid to dispatch requests to the correct servers. Static object requests should be redirected to httpd_docs unless they are already cached, while requests for dynamic documents should go to the httpd_perl server. The configuration below tells Squid to fire off 10 redirect daemons at the specified path of the redirect daemon and (as suggested by Squid's documentation) disables rewriting of any C<Host:> headers in redirected requests. The redirection daemon script is listed below.

  redirect_program /usr/lib/squid/redirect.pl
  redirect_children 10
  redirect_rewrites_host_header off

The maximum allowed request size is in kilobytes, which is mainly useful during C<PUT> and C<POST> requests.
A user who attempts to send a request with a body larger than this limit receives an "Invalid Request" error message. If you set this parameter to zero, there will be no limit imposed. If you are using C<POST> to upload files, then set this to the largest file's size plus a few extra KB:

  request_body_max_size 1000 KB

Then we have access permissions, which we will not explain here. You might want to read the documentation, so as to avoid any security problems.

  acl all src 0.0.0.0/0.0.0.0
  acl manager proto cache_object
  acl localhost src 127.0.0.1/255.255.255.255
  acl myserver src 127.0.0.1/255.255.255.255
  acl SSL_ports port 443 563
  acl Safe_ports port 80 81 8080 443 563
  acl CONNECT method CONNECT

  http_access allow manager localhost
  http_access allow manager myserver
  http_access deny manager
  http_access deny !Safe_ports
  http_access deny CONNECT !SSL_ports
  # http_access allow all

Since Squid should be run as a non-root user, you need these if you are invoking Squid as root. The user I<squid> is created when the Squid server is installed.

  cache_effective_user squid
  cache_effective_group squid

Now configure a memory size to be used for caching. The Squid documentation warns that the actual size of Squid can grow to be three times larger than the value you set.

  cache_mem 20 MB

We want to keep pools of allocated (but unused) memory available for future use, if we have the memory available of course. Otherwise turn it off.

  memory_pools on

Now tighten the runtime permissions of the cache manager CGI script (C<cachemgr.cgi>, which comes bundled with squid) on your production server:

  cachemgr_passwd disable shutdown

If you are not using this script to manage the Squid server remotely, you should disable it:

  cachemgr_passwd disable all

Now the redirection daemon script (you should put it at the location you have specified in the C<redirect_program> parameter in the config file above, and make it executable by the webserver of course):

  #!/usr/local/bin/perl -p
  BEGIN{ $|=1 }
  s|www.example.com(?::81)?/perl/|www.example.com:8080/perl/|o ;

Here is what the regular expression above does: it matches all the URIs that include either the string I<www.example.com/perl/> or the string I<www.example.com:81/perl/> and replaces either of these strings with I<www.example.com:8080/perl/>. No matter whether the regular expression matched or not, the C<$_> variable is automagically printed, thanks to the C<-p> switch.

The above code could also be written as:

  #!/usr/local/bin/perl
  $|=1;
  while (<>) {
    # redirect to mod_perl server (httpd_perl)
    print($_), next
      if s|www.example.com(:81)?/perl/|www.example.com:8080/perl/|o;

    # send it unchanged to plain apache server (httpd_docs)
    print;
  }

The above redirector can be more complex of course, but you know Perl, right?

A few notes regarding the redirector script:

You must disable buffering. C<$|=1;> does the job. If you do not disable buffering, C<STDOUT> will be flushed only when its buffer becomes full--and its default size is about 4096 characters. So if you have an average URL of 70 chars, only after about 59 (4096/70) requests will the buffer be flushed, and the requests will finally reach the server. Your users will not wait that long, unless you have hundreds of requests per second, in which case the buffer will be flushed very frequently because it will fill up very fast.

If you think that this is a very ineffective way to redirect, you should consider the following explanation.
The redirector runs as a daemon; Squid fires up N redirect daemons (10 in our configuration), so there is no Perl interpreter loading overhead. Exactly as with mod_perl, the perl interpreter is loaded into memory all the time and the code has already been compiled, so the redirect is very fast (not much slower than if the redirector was written in C). Squid keeps an open pipe to each redirect daemon, thus there is no system call overhead either.

Now it is time to restart the server; on Linux I do it with:

  /etc/rc.d/init.d/squid restart

Now the Squid server setup is complete.

Almost... When you try the new setup, you will be surprised and upset to discover port 81 showing up in the URLs of the static objects (like HTML files). Hey, we did not want the user to see port 81 and use it instead of 80, since then requests would bypass the squid server and all the hard work we went through would be just a waste of time!

The solution is to make both squid and httpd_docs listen to the same port. This can be accomplished by binding each one to a specific interface (so they are listening on different B<sockets>). Modify I<httpd_docs/conf/httpd.conf>:

  Port 80
  BindAddress 127.0.0.1
  Listen 127.0.0.1:80

Now the httpd_docs server is listening only to requests coming from the local server. You cannot access it directly from the outside. Squid becomes a gateway that all the packets go through on the way to the httpd_docs server.

Modify I<squid.conf>:

  http_port 80
  tcp_outgoing_address 127.0.0.1
  httpd_accel_host 127.0.0.1
  httpd_accel_port 80

Now restart the Squid and httpd_docs servers (it doesn't matter which one you start first), and voila--the port number has gone.

You must also have in the file I</etc/hosts> the following entry (chances are that it's already there):

  127.0.0.1 localhost.localdomain localhost

Now if your scripts are generating HTML including fully qualified self references, using 8080 or the other port, you should fix them to generate links to point to port 80 (which means not using the port at all in the URI). If you do not do this, users will bypass Squid and will make direct requests to the mod_perl server's port. As we will see later, just like with httpd_docs, the httpd_perl server can be configured to listen only to requests coming from the localhost (with Squid forwarding these requests from the outside) and therefore users will not be able to bypass Squid.

To save you some keystrokes, here is the whole modified C<squid.conf>:

  http_port 80
  tcp_outgoing_address 127.0.0.1
  httpd_accel_host 127.0.0.1
  httpd_accel_port 80

  icp_port 0

  acl QUERY urlpath_regex /cgi-bin /perl
  no_cache deny QUERY

  # debug_options 28

  redirect_program /usr/lib/squid/redirect.pl
  redirect_children 10
  redirect_rewrites_host_header off

  request_body_max_size 1000 KB

  acl all src 0.0.0.0/0.0.0.0
  acl manager proto cache_object
  acl localhost src 127.0.0.1/255.255.255.255
  acl myserver src 127.0.0.1/255.255.255.255
  acl SSL_ports port 443 563
  acl Safe_ports port 80 81 8080 443 563
  acl CONNECT method CONNECT

  http_access allow manager localhost
  http_access allow manager myserver
  http_access deny manager
  http_access deny !Safe_ports
  http_access deny CONNECT !SSL_ports
  # http_access allow all

  cache_effective_user squid
  cache_effective_group squid

  cache_mem 20 MB

  memory_pools on

  cachemgr_passwd disable shutdown

Note that all directives should start at the beginning of the line, so if you cut and paste from the text make sure you remove the leading whitespace from each line.
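Before restarting Squid with a freshly written configuration, it's worth letting it validate the file first; Squid 2.x can do this with the C<-k parse> option, and C<-k reconfigure> reloads the configuration of an already running server:

  % squid -k parse          # check squid.conf for syntax errors
  % squid -k reconfigure    # reload the configuration without a full restart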
=head1 Running One Webserver and Squid in httpd Accelerator Mode

When I was first told about Squid, I thought: "Hey, now I can drop the C<httpd_docs> server and have just Squid and the C<httpd_perl> servers". Since all my static objects will be cached by squid, I do not need the light C<httpd_docs> server.

But I was wrong. Why? Because I still have the overhead of loading the objects into Squid the first time. If a site has many of them, unless a huge chunk of memory is devoted to Squid they won't all be cached, and the heavy mod_perl server will still have the task of serving static objects.

How do we measure the overhead? The difference between the two servers is in memory consumption; everything else (e.g. I/O) should be equal. So you have to estimate the time needed to fetch each static object for the first time at a peak period, and thus the number of additional servers you need for serving the static objects. This will allow you to calculate the additional memory requirements. I imagine that this amount could be significant in some installations.

So for production servers I have decided to stick with the Squid, httpd_docs and httpd_perl scenario, where I can optimize and fine tune everything. But if in your case there are almost no static objects to serve, the httpd_docs server is definitely redundant. All you need are the mod_perl server and Squid to buffer the output from it.

If you want to proceed with this setup, install mod_perl enabled Apache and Squid. Then use a configuration similar to the previous section's, but now httpd_docs is not there anymore. Also, we do not need the redirector anymore, and we specify C<httpd_accel_host> as the name of the server instead of C<virtual>. Because we do not redirect, there is no need to bind two servers on the same port, so there are no C<BindAddress> or C<Listen> directives in I<httpd.conf>.

The modified configuration for this simplified setup (see the explanations in the previous section):

  httpd_accel_host put.your.hostname.here
  httpd_accel_port 8080
  http_port 80
  icp_port 0

  acl QUERY urlpath_regex /cgi-bin /perl
  no_cache deny QUERY

  # debug_options 28

  # redirect_program /usr/lib/squid/redirect.pl
  # redirect_children 10
  # redirect_rewrites_host_header off

  request_body_max_size 1000 KB

  acl all src 0.0.0.0/0.0.0.0
  acl manager proto cache_object
  acl localhost src 127.0.0.1/255.255.255.255
  acl myserver src 127.0.0.1/255.255.255.255
  acl SSL_ports port 443 563
  acl Safe_ports port 80 81 8080 443 563
  acl CONNECT method CONNECT

  http_access allow manager localhost
  http_access allow manager myserver
  http_access deny manager
  http_access deny !Safe_ports
  http_access deny CONNECT !SSL_ports
  # http_access allow all

  cache_effective_user squid
  cache_effective_group squid

  cache_mem 20 MB

  memory_pools on

  cachemgr_passwd disable shutdown

=head1 mod_proxy

mod_proxy implements a proxy/cache for Apache. It implements proxying capability for FTP, CONNECT (for SSL), HTTP/0.9, and HTTP/1.0. The module can be configured to connect to other proxy modules for these and other protocols.

=head2 Concepts and Configuration Directives

In the following explanation, we will use I<www.example.com> as the main server which users access when they want to get some kind of service, and I<backend.example.com> as a machine that does the heavy work. The main and the back-end are different servers; they may or may not coexist on the same machine.

The mod_proxy module is built into the server that answers requests to the I<www.example.com> hostname.
For the purposes of this discussion it doesn't matter what functionality is built into the I<backend.example.com> server; obviously it'll be mod_perl for most of us.

=head3 ProxyPass

You can use the C<ProxyPass> configuration directive for mapping remote hosts into the space of the local server; the local server does not act as a proxy in the conventional sense, but appears to be a mirror of the remote server.

Let's explore what this rule does:

  ProxyPass /modperl/ http://backend.example.com/modperl/

When a user initiates a request to http://www.example.com/modperl/foo.pl, the request will be internally proxied to http://backend.example.com/modperl/foo.pl. However, if the back-end responds with an HTTP redirect, from that moment on the user will see http://backend.example.com/ in her location window, instead of http://www.example.com/. You have probably noticed many examples of this from real life Internet sites you've visited. Free-email service providers and other similar heavy online services display the login or the main page from their main server, and then when you log in you see something like I<x11.example.com>, then I<w59.example.com>, etc. These are the back-end servers that do the actual work.

Obviously this is not an ideal solution, but usually users don't really care about what they see in the location window. So you can get away with this approach. As I'll show in a minute there is a better solution which removes this caveat and provides even more useful functionality.

=head3 ProxyPassReverse

This directive lets Apache adjust the URL in the C<Location> header on HTTP redirect responses. This is essential, for example, when Apache is used as a reverse proxy, to avoid the reverse proxy being bypassed because of HTTP redirects issued by the back-end servers which stay behind it. It is generally used in conjunction with the C<ProxyPass> directive to build a complete front-end proxy server.

  ProxyPass        /modperl/ http://backend.example.com/modperl/
  ProxyPassReverse /modperl/ http://backend.example.com/modperl/

When a user initiates a request to http://www.example.com/modperl/foo.pl, the request will be proxied to http://backend.example.com/modperl/foo.pl, but on the way back C<ProxyPassReverse> will correct the location URL to become http://www.example.com/modperl/foo.pl. This happens completely transparently. The end user will never know that something has happened to his request behind the scenes.

Note that this C<ProxyPassReverse> directive can also be used in conjunction with the proxy pass-through feature (C<RewriteRule ... [P]>) from mod_rewrite, because it doesn't depend on a corresponding C<ProxyPass> directive.

=head3 Security Issues

Whenever you use mod_proxy you need to make sure that your server will not become a proxy for free riders. Allowing clients to issue proxy requests is controlled by the C<ProxyRequests> directive. Its default setting is C<off>, which means proxy requests are handled only if generated internally (by C<ProxyPass> or C<RewriteRule...[P]> directives.) Do not use the C<ProxyRequests> directive on your reverse proxy servers.

=head2 Buffering Feature

In addition to correcting the URI on its way back from the back-end server, mod_proxy also provides buffering services which mod_perl and similar heavy modules benefit from. The buffering feature allows mod_perl to pass the generated data to mod_proxy and move on to serve new requests, instead of waiting for a possibly slow client to receive all the data.
=head2 Buffering Feature

In addition to correcting the URI on its way back from the back-end
server, mod_proxy also provides a buffering service which mod_perl
and similar heavy modules benefit from.  The buffering feature allows
mod_perl to pass the generated data to mod_proxy and move on to serve
new requests, instead of waiting for a possibly slow client to
receive all the data.

This figure depicts the feature:

                  [socket]                    wire      `o'
  [mod_perl] => [        ] => [mod_proxy] =>  ____  =>  /|\
                  [buffer]                              / \

From looking at this figure it's easy to see that the bottleneck is
the socket buffer; it has to be able to absorb all the data that
mod_perl has generated in order to free the mod_perl process
immediately; mod_proxy will take the data as fast as mod_perl can
deliver it, freeing the mod_perl server to service new requests as
soon as possible while mod_proxy feeds the client at whatever rate
the client requires.

C<ProxyReceiveBufferSize> is the name of the parameter that specifies
the size of the socket buffer.  Configuring:

  ProxyReceiveBufferSize 16384

will create a buffer of 16KB in size.  If mod_perl generates output
of less than 16KB, the process will be untied immediately and allowed
to serve new requests.  If the output is bigger than 16KB, the
following process takes place:

=over

=item 1

The first 16KB will enter the system buffer.

=item 2

mod_proxy picks the first 8KB and sends it down the wire.

=item 3

mod_perl writes the next 8KB into the place of the 8KB of data that
was just sent off by mod_proxy.

=back

Stages 2 and 3 are repeated until mod_perl has no more data to send.
When this happens, mod_perl can serve a new request while stage 2 is
repeated until all the data has been picked from the system buffer
and sent down the wire.

Of course you want to set the buffer size as large as possible, since
you want the heavy mod_perl processes to be utilized in the most
efficient way, and you don't want them to waste their time waiting
for a client to receive the data, especially if the client has a slow
downstream connection.

As the name C<ProxyReceiveBufferSize> states, this buffering applies
only to I<downstream data> (coming from the origin server to the
proxy), not to upstream data.  There is no buffering of data uploaded
from the client browser to the proxy, so you cannot use this
technique to prevent the heavy mod_perl server from being tied up
during a large POST such as a file upload.  Falling back to mod_cgi
seems to be the best solution for those specific scripts whose major
function is receiving large amounts of upstream data.

[META: check this: --]

Of course, just like mod_perl, mod_proxy writes the data it
proxy-passes into its outgoing socket buffer, so the mod_proxy
process gets released as soon as the last chunk of data is deposited
into this buffer, even if the client hasn't completed the download.
It's the OS's job to complete the transfer and release the TCP socket
used for it.

Therefore if you don't use mod_proxy and mod_perl sends its data
directly to the client, and you have a big socket buffer, the
mod_perl process will be released as soon as the last chunk of data
enters the buffer.  Just as with mod_proxy, the OS will deal with
completing the data transfer.

[based on this comment] yes, too (but the receive and transmit
buffers may be of different sizes, depending on the OS).  What I
don't know is whether the call to close the socket waits until all
the data has actually been sent successfully or not.  If it doesn't
wait, you may not be notified of any failure; but because the
proxying Apache can write to the socket transmission buffer as fast
as it can read, it should be possible for the proxying Apache to copy
all the data from the receive buffer to the transmission buffer and
then release the receive buffer, so the mod_perl Apache is free to do
other things while the proxying Apache still waits until the client
acknowledges that the data transfer succeeded.  (The last part is the
one I am not sure about.)

[/META]
Unfortunately you cannot set the socket buffer size as large as you
want, because the available physical memory is limited and OSs have
their own upper limits on the possible buffer size.  This doesn't
mean that you cannot change the OS-imposed limits, but to do so you
have to know the appropriate techniques.  In the next section we
present a few OSs and the ways to increase their socket buffer sizes.
To increase the physical memory limit you just have to add more
memory.

=head2 Setting the Buffering Limits on Various OSs

As we have just seen, there are a few kinds of parameters we might
want to adjust for our needs.

=head3 IOBUFSIZE Source Code Definition

The first parameter is used by I<proxy_util.c:ap_proxy_send_fb()> to
loop over content being proxy-passed in 8KB chunks (as of this
writing), passing them on to the client.  In other words it specifies
the size of the data that is sent down the wire.  This parameter is
defined by the C<IOBUFSIZE> macro:

  #define IOBUFSIZE 8192

You have no control over this setting in the server configuration
file, so if you want to change it you have to edit the source files
before you compile the server.

=head3 ProxyReceiveBufferSize Configuration Directive

You can control the socket buffer size with the
C<ProxyReceiveBufferSize> directive:

  ProxyReceiveBufferSize 16384

The above setting will set a buffer size of 16KB.  If it is not set
explicitly, or if it is set to 0, then the default buffer size is
used.  The number should be an integral multiple of 512.

Note that if you set the value of C<ProxyReceiveBufferSize> larger
than the OS limit, the default value will be used.

Both the default and the maximum possible values of
C<ProxyReceiveBufferSize> depend on the operating system.

=over

=item * Linux

For 2.2 kernels the maximum limit is in
I</proc/sys/net/core/rmem_max> and the default value is in
I</proc/sys/net/core/rmem_default>.  If you want to increase the
C<RCVBUF> size above 65535, the default maximum value, you first have
to raise the absolute limit in I</proc/sys/net/core/rmem_max>.  To do
that at run time, execute this command to raise it to 128KB:

  % echo 131072 > /proc/sys/net/core/rmem_max

You probably want to put this command into I</etc/rc.d/rc.local> so
the change will take effect at system reboot.

On Linux with kernel 2.2.5 the maximum and default values are either
32KB or 64KB.  You can also change the default and maximum values
during kernel compilation; for that you should alter the
C<SK_RMEM_DEFAULT> and C<SK_RMEM_MAX> definitions respectively.
(Since kernel source files tend to change, use the grep(1) utility to
find the files.)

=item * FreeBSD

Under FreeBSD it's possible to configure the kernel to have bigger
socket buffers:

  % sysctl -w kern.ipc.maxsockbuf=2621440

=item * Solaris

Under Solaris this upper limit is specified by the I<tcp_max_buf>
parameter and is 256KB.

=item * Other OSs

[ReaderMeta]: If you use an OS that is not listed here and know how
to increase the socket buffer size, please let me know.

=back

When you have told the kernel to use bigger sockets you can set
bigger values for C<ProxyReceiveBufferSize>, e.g. 1048576 (1MB).
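To put the two steps together, here is a sketch for a Linux 2.2
system (using the file and directive already discussed; adjust the
value and the path for your OS):

  # raise the kernel's absolute limit to 1MB,
  # e.g. from /etc/rc.d/rc.local:
  % echo 1048576 > /proc/sys/net/core/rmem_max

and then in the proxy server's I<httpd.conf>:

  ProxyReceiveBufferSize 1048576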
=head3 Hacking the Code

Some folks have patched Apache's 1.3.x source code to make the
application buffer configurable as well.  After the patch there are
two configuration directives available:

=over

=item *

ProxyReceiveBufferSize -- sets the socket buffer size

=item *

ProxyInternalBufferSize -- sets the application buffer size

=back

To patch the source, rename ap_bcreate() to ap_bcreate_size() and add
a size parameter, which defaults to I<IOBUFSIZE> if 0 is passed.
Then add

  #define ap_bcreate(p,flags) ap_bcreate(p,flags,0)

and add a new ap_bcreate() which calls ap_bcreate_size() for binary
compatibility.

Actually the C<ProxyReceiveBufferSize> directive should be called
C<ProxySocketBufferSize>.  That would also remove some of the
confusion about what it actually does.

=head2 Caching Feature

META: complete the conf details

Apache does caching as well.  It's relevant to mod_perl only if you
produce proper headers, so your scripts' output can be cached.  See
the Apache documentation for more details on the configuration of
this capability.

=head2 Build Process

To build mod_proxy into Apache just add I<--enable-module=proxy>
during the Apache B<./configure> stage.  Since you will probably also
need the mod_rewrite capability, enable it as well with
I<--enable-module=rewrite>.

=head1 Front-end Back-end Proxying with Virtual Hosts

This section explains a configuration setup for proxying your
back-end mod_perl servers when you need to use virtual hosts.

The term I<Virtual Host> refers to the practice of maintaining more
than one server on one machine, as differentiated by their apparent
hostname.  For example, it is often desirable for companies sharing a
web server to have their own domains, with web servers accessible as
I<www.company1.com> and I<www.company2.com>, without requiring the
user to know any extra path information.

The approach is to use a unique port number for each virtual host at
the back-end server, so you can redirect from the front-end server to
localhost:1234, and use name-based virtual servers on the front end,
though any technique on the front end will do.

If you run the front-end and the back-end servers on the same machine
you can prevent any direct outside connections to the back-end server
if you bind tightly to address C<127.0.0.1> (I<localhost>), as you
will see in the following configuration example.

The front-end (light) server configuration:

  <VirtualHost 10.10.10.10>
    ServerName www.example.com
    ServerAlias example.com
    RewriteEngine On
    RewriteOptions 'inherit'
    RewriteRule \.(gif|jpg|png|txt|html)$ - [last]
    RewriteRule ^/(.*)$ http://localhost:4077/$1 [proxy]
  </VirtualHost>

  <VirtualHost 10.10.10.10>
    ServerName foo.example.com
    RewriteEngine On
    RewriteOptions 'inherit'
    RewriteRule \.(gif|jpg|png|txt|html)$ - [last]
    RewriteRule ^/(.*)$ http://localhost:4078/$1 [proxy]
  </VirtualHost>

The above front-end configuration handles two virtual hosts:
I<www.example.com> and I<foo.example.com>.  The two setups are almost
identical.

The front-end server will handle files with the extensions I<.gif>,
I<.jpg>, I<.png>, I<.txt> and I<.html> internally; the rest will be
proxied to be handled by the back-end server.

The only difference between the two virtual host settings is that the
former rewrites requests to port C<4077> at the back-end machine and
the latter to port C<4078>.
If your server is configured to run traditional CGI scripts (under
mod_cgi) as well as mod_perl CGI programs, it would be beneficial to
configure the front-end server to run the traditional CGI scripts
directly.  This can be done by altering the C<gif|jpg|png|txt>
I<Rewrite> rule to add C<|cgi> at the end, if all your mod_cgi
scripts have the I<.cgi> extension, or by adding a new rule to handle
all C</cgi-bin/*> locations locally.

The back-end (heavy) server configuration:

  Port 80

  PerlPostReadRequestHandler My::ProxyRemoteAddr

  Listen 4077
  <VirtualHost localhost:4077>
    ServerName www.example.com
    DocumentRoot /home/httpd/docs/www.example.com
    DirectoryIndex index.shtml index.html
  </VirtualHost>

  Listen 4078
  <VirtualHost localhost:4078>
    ServerName foo.example.com
    DocumentRoot /home/httpd/docs/foo.example.com
    DirectoryIndex index.shtml index.html
  </VirtualHost>

The back-end server can tell which virtual host a request was made
for by checking the port number the request was proxied to, and it
uses the appropriate virtual host section to handle it.

We set S<"Port 80"> so that any redirects don't get sent directly to
the back-end port.

To get the I<real> remote IP addresses from the proxy, the
L<My::ProxyRemoteAddr|scenario/Getting_the_Remote_Server_IP_in_the_Back_end_server_in_the_Proxy_Setup>
handler is used, based on the C<mod_proxy_add_forward> Apache module.
Prior to mod_perl 1.22 this setting had to be repeated for each
virtual host, since it wasn't inherited by the virtual hosts.

The following configuration is yet another useful example, showing
the other way around.  It specifies what is to be proxied, and then
the rest is served by the front end:

  RewriteEngine     on
  RewriteLogLevel   0
  RewriteRule       ^/(perl.*)$  http://127.0.0.1:8052/$1  [P,L]
  NoCache           *
  ProxyPassReverse  /  http://www.example.com/

Here we don't have to specify a rule for static objects to be served
by the front end, as we did in the previous example to handle files
with the extensions I<.gif>, I<.jpg>, I<.png> and I<.txt> internally.

=head1 Getting the Remote Server IP in the Back-end server in the Proxy Setup

Ask Bjoern Hansen has written the C<mod_proxy_add_forward> module for
Apache.  It sets the C<X-Forwarded-For> field when doing a
C<ProxyPass>, similar to what Squid can do.  Its location is
specified in the L<download|download/mod_proxy_add_forward> section.

Basically, this module adds an extra HTTP header to proxied requests.
You can access that header in the mod_perl-enabled server, and use
its value to set the IP address of the remote client.  You won't need
to compile anything into the back-end server.

=head2 Build

Download the module and use its location as the value of the
I<--activate-module> argument to the I<./configure> utility within
the Apache source code, so the module can be found.

  ./configure \
      "--with-layout=Apache" \
      "--activate-module=src/modules/extra/mod_proxy_add_forward.c" \
      "--enable-module=proxy_add_forward" \
      ... other options ...

I<--enable-module=proxy_add_forward> enables this module, as you have
guessed already.
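Once Apache has been rebuilt, a quick way to verify that the module
was indeed compiled in is to list the compiled-in modules (assuming
the C<httpd> binary is in your path):

  % httpd -l | grep proxy_add_forward

which should print I<mod_proxy_add_forward.c> if everything went
well.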
=head2 Usage

If you are using the C<Apache::Registry> or C<Apache::PerlRun>
modules, just put the following code into I<startup.pl>:

  use Apache::Constants ();

  sub My::ProxyRemoteAddr ($) {
      my $r = shift;

      # we'll only look at the X-Forwarded-For header if the request
      # comes from our proxy at localhost
      return Apache::Constants::OK
          unless ($r->connection->remote_ip eq "127.0.0.1")
              and $r->header_in('X-Forwarded-For');

      # Select the last value in the chain -- the original client's IP
      if (my ($ip) = $r->headers_in->{'X-Forwarded-For'} =~ /([^,\s]+)$/) {
          $r->connection->remote_ip($ip);
      }

      return Apache::Constants::OK;
  }

And in the mod_perl server's I<httpd.conf>:

  PerlPostReadRequestHandler My::ProxyRemoteAddr

and the right thing will happen transparently for your scripts.
Otherwise, if you write your own mod_perl content handler, you can
retrieve the header directly in your code using similar code.

=head2 Security

Different sites have different needs.  If you use the header to set
the IP address, Apache believes it.  This is reflected in the
logging, for example.  You really don't want anyone but your own
system to set the header, which is why the I<recommended code> above
checks where the request really came from before changing
C<remote_ip>.

Generally you shouldn't trust the C<X-Forwarded-For> header.  You
should only rely on C<X-Forwarded-For> headers from proxies you
control yourself.  If you know how to spoof a cookie you've probably
got the general idea of forging HTTP headers, and can spoof the
C<X-Forwarded-For> header as well.  The only address you can count on
as being a reliable value is the one from
C<r-E<gt>connection-E<gt>remote_ip>.

From that point on, the remote IP address is correct.  You should be
able to access the C<$ENV{REMOTE_ADDR}> environment variable as
usual.

=head2 Caveats

It was reported that Ben Laurie's Apache-SSL does not put the IP
addresses in the C<X-Forwarded-For> header--it does not set up such a
header at all.  However, the C<$ENV{REMOTE_ADDR}> environment
variable it sets up contains the IP address of the original client
machine.

Prior to mod_perl 1.22 there was a need to repeat the
S<PerlPostReadRequestHandler My::ProxyRemoteAddr> directive for each
virtual host, since it wasn't inherited by the virtual hosts.

=head2 mod_proxy_add_forward Module's Order Precedence

Some users report that they cannot get this module to work as
advertised.  They verify that the module is built in, but the
front-end server is not generating the C<X-Forwarded-For> header when
requests are being proxied to the back-end server.  As a result, the
back-end server has no idea what the remote IP is.

As it turns out, I<mod_proxy_add_forward> needs to be configured in
Apache before I<mod_proxy> in order to operate properly, since Apache
gives highest precedence to the last defined module.

Moving the two build options required to enable
I<mod_proxy_add_forward> while configuring Apache appears to have no
effect on the default configuration order of modules, since in each
case the resulting builds show I<mod_proxy_add_forward> last in the
list (or first via I</server-info>).

One solution is to explicitly define the configuration order in the
I<httpd.conf> file, so that I<mod_proxy_add_forward> appears before
I<mod_proxy>, and therefore gets executed after I<mod_proxy>.
(Modules are executed in I<reverse> order, i.e. the module that was
I<Added> first will be executed last.)
Obviously, this list would need to be tailored to match your build
environment, but to ease this task just insert an C<AddModule>
directive before each entry reported by C<httpd -l> (removing
I<http_core.c>, of course):

  ClearModuleList
  AddModule mod_env.c
  [more modules snipped]
  AddModule mod_proxy_add_forward.c
  AddModule mod_proxy.c
  AddModule mod_rewrite.c
  AddModule mod_setenvif.c

Note that the above snippet is added to the I<httpd.conf> of the
front-end server.

Another solution is to reorder the module list during configuration
by using one or more C<--permute-module> arguments to the
I<./configure> utility.  (Try C<./configure --help> to see whether
your version of Apache supports this option.)
C<--permute-module=foo:bar> will swap the positions of I<mod_foo> and
I<mod_bar> in the list, C<--permute-module=BEGIN:foo> will move
I<mod_foo> to the beginning of the list, and
C<--permute-module=foo:END> will move I<mod_foo> to the end.  For
example, suppose your module list from C<httpd -l> looks like:

  http_core.c
  [more modules snipped]
  mod_proxy.c
  mod_setenvif.c
  mod_proxy_add_forward.c

You might add the following arguments to I<./configure> to move
I<mod_proxy_add_forward> to the position in the list just before
I<mod_proxy>:

  ./configure \
      "--with-layout=Apache" \
      "--activate-module=src/modules/extra/mod_proxy_add_forward.c" \
      "--enable-module=proxy_add_forward" \
      ... other options ... \
      "--permute-module=proxy:proxy_add_forward" \
      "--permute-module=setenvif:END"

With this change, the C<X-Forwarded-For> header is sent to the
back-end server, and the remote IP appears in the back-end server's
I<access_log> file.

=head1 HTTP Authentication With Two Servers Plus a Proxy

Assuming that you have a setup of one "front-end" server which
proxies the "back-end" (mod_perl) server, if you need to perform
authentication in the "back-end" server, it should handle all the
authentication itself.  If Apache proxies correctly, it will pass
through all the authentication information, making the "front-end"
Apache somewhat "dumb", as it does nothing but pass the information
through.

In the configuration file your C<Auth> configuration directives need
to be inside the C<E<lt>Directory ...E<gt>> ...
C<E<lt>/DirectoryE<gt>> sections, because if you use
C<E<lt>Location ...E<gt>> ... C<E<lt>/LocationE<gt>> sections the
proxy server will take the authentication information for itself and
not pass it on.

The same applies to mod_ssl and similar Apache SSL modules.  If SSL
gets plugged into the front-end server, it will properly
encode/decode all the SSL requests, so if your machine is secured
from the inside, your back-end server can do secure transactions.
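For illustration, here is a minimal sketch of such a back-end
configuration (the path, the realm name and the password file are
made-up placeholders):

  <Directory /home/httpd/docs/private>
    AuthName "Private Area"
    AuthType Basic
    AuthUserFile /home/httpd/conf/private.passwd
    Require valid-user
  </Directory>

Since the restriction lives in a C<E<lt>DirectoryE<gt>> section on
the back-end server, the front-end proxy simply relays the
C<Authorization> header instead of consuming it.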
=head1 mod_rewrite Examples

In the mod_proxy + mod_perl servers scenario, C<ProxyPass> was used
to redirect all requests to the mod_perl server by matching the
beginning of the relative URI (e.g. I</perl>).  What should you do if
you want everything I<except> files with extensions like I<.gif>,
I<.cgi> and similar to be proxy-passed to the mod_perl server?  These
files are to be served by the light Apache server, which carries the
mod_proxy module.

The following example rewrites everything to the mod_perl server.  It
locally handles all requests for files with the extensions I<gif>,
I<jpg>, I<png>, I<css>, I<txt>, I<cgi> and relative URIs starting
with I</cgi-bin> (e.g. if you want some scripts to be executed under
mod_cgi).

  RewriteEngine On
  # handle GIF and JPG images and traditional CGI's directly
  RewriteRule \.(gif|jpg|png|css|txt|cgi)$ - [last]
  RewriteRule ^/cgi-bin - [last]
  # pass off everything but images to the heavy-weight server via proxy
  RewriteRule ^/(.*)$ http://localhost:4077/$1 [proxy]

That is: first, handle locally what you want to handle locally, then
hand off everything else to the back-end server.

This is the configuration of the logging facilities:

  RewriteLogLevel 1
  RewriteLog "| /usr/local/apache_proxy/bin/rotatelogs \
  /usr/local/apache-common/logs/r_log 86400"

It says: log all the rewrites through the pipe to the C<rotatelogs>
utility, which will rotate the logs every 24 hours (86400 seconds).

More examples.  Redirect all those IE5 requests for I<favicon.ico> to
a central image:

  RewriteRule .*favicon.ico /wherever/favicon.ico [PT,NS]

A quick way to make dynamic pages look static:

  RewriteRule ^/wherever/([a-zA-Z]+).html /perl-bin/$1.cgi [PT]

Instead of keeping all your Perl scripts in I</perl> and your static
content everywhere else, you could keep your static content in
special directories and keep your Perl scripts everywhere else.  You
can still use the light/heavy Apache separation approach described
above, with a few minor modifications.

In the I<light> Apache's I<httpd.conf> file, turn rewriting on:

  RewriteEngine On

Now list all the directories that contain only static objects.  For
example, if the only such directories relative to C<DocumentRoot> are
I</images> and I</style>, you can set the following rule:

  RewriteRule ^/(images|style) - [L]

The C<[L]> (I<Last>) flag means that the rewrite engine should stop
if it has a match.  This is necessary because the very last rewrite
rule proxies everything to the I<heavy> server:

  RewriteRule ^/(.*) http://www.example.com:8080/$1 [P]

This line is the difference between a server for which static content
is the default and one for which dynamic (perlish) content is the
default.

You should also add the I<reverse rewrite rule> as before:

  ProxyPassReverse / http://www.example.com/

so that the user doesn't see the port number C<:8080> in the
browser's location window.

It is possible to use C<localhost> in the C<RewriteRule> above if the
heavy and light servers are on the same machine.  So if we sum up the
above setup we get:

  RewriteEngine On
  RewriteRule ^/(images|style) - [L]
  RewriteRule ^/(.*) http://www.example.com:8080/$1 [P]
  ProxyPassReverse / http://www.example.com/

=head1 Caching in mod_proxy

This is not really mod_perl related, so I'll just stress one point.
If you want the caching to work, the following HTTP headers should be
supplied: C<Last-Modified>, C<Content-Length> and C<Expires>.

=cut

1.1                  modperl-docs/src/docs/1.0/guide/security.pod

Index: security.pod
===================================================================
=head1 NAME

Protecting Your Site

=head1 The Importance of Your Site's Security

Let's face it, your site or service can easily become a target for
Internet "terrorists".  It can be because of something you said, the
success of your site, or for no obvious reason whatever.  If your
site security is compromised, all your data can be deleted or
important information can be stolen.  You may risk legal action or
the sack if this happens.

Your site can be paralyzed through a I<simple> I<denial of service>
(DoS) attack.

Whatever you do, as long as you are connected to the network your
site will be vulnerable.  Cut the connections, turn off your machine
and put it into a safe.  Now it is protected--but useless.

So what can you do?
Let's first get acquainted with some security-related terminology:

=over 4

=item Authentication

When you want to make sure that a user is who she claims to be, you
generally ask her for a username and a password.  Once you have both,
you can check them against your L<database of username/password
pairs|databases/>.  If they match, the user has passed the
B<Authentication> stage.  From now on, if you keep the
L<session|modules/Apache__Session___Maintain_session_state_across_HTTP_requests>
open, all you need to do is to remember the username.

=item Authorization

You might want to allow user B<foo> to have access to some resource,
but restrict her from accessing another resource, which in turn is
accessible only to user B<bar>.  The process of checking access
rights is called B<Authorization>.  For B<Authorization> all you need
is an authenticated username or some other attribute which you can
authorize against.  For example, you can authorize against IP number,
allowing only your local users to use some service.  But be warned
that IP numbers or session_ids can be spoofed (forged), and that is
why you should not do B<Authorization> without B<Authentication>.

=back

Actually you've been familiar with both of these concepts for a
while.  When you telnet to your account on some machine you go
through a login process (B<Authentication>).  When you try to read
some file from your file system, the kernel checks the permissions on
this file (B<Authorization>).  You may also hear about B<Access
control>, which is another name for the same thing.

=head1 Illustrated Security Scenarios

I am going to present some real-world security requirements and their
implementations.

=head2 Non authenticated access for internal IPs, Authenticated for external IPs

An B<Extranet> is very similar to an B<Intranet>, but at least partly
accessible from outside your organization.  If you run an Extranet
you might want to let your internal users have unrestricted access to
your web server.  If these same users call from outside your
organization, you might want to make sure that they are in fact your
employees.

These requirements are achieved very simply by putting the IP
patterns of the organization in a Perl Access Handler in an
C<.htaccess> file.  This sets the C<REMOTE_USER> environment variable
to the organization's generic username.  Scripts can test the
C<REMOTE_USER> environment variable to determine whether to allow
unrestricted access or else to require authentication.

Once a user passes the authentication stage, either bypassing it
because of his IP address or after entering a correct login/password
pair, the C<REMOTE_USER> variable is set.  Then we can talk about
authorization.

Let's see the implementation of the authentication stage.
First we modify I<httpd.conf>:

  PerlModule My::Auth

  <Location /private>
    PerlAccessHandler My::Auth::access_handler
    PerlSetVar Intranet "10.10.10.1 => userA, 10.10.10.2 => userB"
    PerlAuthenHandler My::Auth::authen_handler
    AuthName realm
    AuthType Basic
    Require valid-user
    Order deny,allow
    Deny from all
  </Location>

Now the code of I<My/Auth.pm>:

  package My::Auth;
  use Apache::Constants qw(OK FORBIDDEN AUTH_REQUIRED);

  sub access_handler {

      my $r = shift;

      unless ($r->some_auth_required) {
          $r->log_reason("No authentication has been configured");
          return FORBIDDEN;
      }
      # get the list of IP addresses
      my %ips = split /\s*(?:=>|,)\s*/, $r->dir_config("Intranet");

      if (my $user = $ips{$r->connection->remote_ip}) {

          # update the connection record
          $r->connection->user($user);

          # do not ask for a password
          $r->set_handlers(PerlAuthenHandler => [\&OK]);
      }
      return OK;
  }

  sub authen_handler {

      my $r = shift;

      # get the user's authentication credentials
      my ($res, $sent_pw) = $r->get_basic_auth_pw;
      return $res if $res != OK;
      my $user = $r->connection->user;

      # authenticate through DBI
      my $reason = authen_dbi($r, $user, $sent_pw);

      if ($reason) {
          $r->note_basic_auth_failure;
          $r->log_reason($reason, $r->uri);
          return AUTH_REQUIRED;
      }
      return OK;
  }

  sub authen_dbi {
      my ($r, $user, $sent_pw) = @_;

      # validate username/passwd
      return 0 if (*PASSED*);  # replace with real code!!!

      return "Failed for X reason";
  }

  # don't forget 1;
  1;

You can implement your own C<authen_dbi()> routine, or you can
replace C<authen_handler()> with an existing authentication handler
such as C<Apache::AuthenDBI>.

If one of the IP addresses is matched, C<access_handler()> sets
C<REMOTE_USER> to be either C<userA> or C<userB>.  If neither IP
address is matched, C<PerlAuthenHandler> will not be set to OK, and
the Authentication stage will ask the user for a login and password.

=head1 Authentication code snippets

=head2 Forcing re-authentication

To force an authenticated user to reauthenticate, just send the
following header to the browser:

  WWW-Authenticate: Basic realm="My Realm"
  HTTP/1.0 401 Unauthorized

This will pop up (in Netscape at least) a window saying
B<Authorization Failed.  Retry?> with B<OK> and B<Cancel> buttons.
When that window pops up you know that the password has been
discarded.  If the user hits the B<Cancel> button, the username will
also be discarded.  If she hits the B<OK> button, the authentication
window will be brought up again with the previous username already in
place.

In the Perl API you would use the note_basic_auth_failure() method to
force reauthentication.  This may not work!  The browser's behaviour
is in no way guaranteed.
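Here is a minimal sketch of forcing re-authentication from within a
mod_perl handler, using the API call just mentioned (when and why to
force it is left as a stub for your own logic):

  use Apache::Constants qw(AUTH_REQUIRED);

  sub handler {
      my $r = shift;
      # ... decide that the user must re-authenticate ...
      $r->note_basic_auth_failure;  # make the browser re-prompt
      return AUTH_REQUIRED;
  }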
=head2 OK, AUTH_REQUIRED and FORBIDDEN in Authentication handlers

When your authentication handler returns OK, it means that the user
has authenticated correctly, and
C<$r-E<gt>connection-E<gt>user> will have the username set for
subsequent requests.  For C<Apache::Registry> and friends, where the
environment variable settings weren't erased, the equivalent
C<$ENV{REMOTE_USER}> variable will be available.

The password is available only through the Perl API, with the help of
the get_basic_auth_pw() method.

If there is a failure, unless it's the first time, the
C<AUTH_REQUIRED> flag will tell the browser to pop up an
authentication window, to try again.  For example:

  my ($status, $sent_pw) = $r->get_basic_auth_pw;
  unless ($r->connection->user and $sent_pw) {
      $r->note_basic_auth_failure;
      $r->log_reason("Both a username and password must be provided");
      return AUTH_REQUIRED;
  }

Let's say that you have a mod_perl authentication handler where the
user's credentials are checked against a database.  It returns either
C<OK> or C<AUTH_REQUIRED>.  One possible authentication failure case
is when the username/password are correct, but the user's account has
been suspended temporarily.  If this is the case you would like to
make the user aware of it by displaying a page, instead of having the
browser pop up the authentication dialog again.  You will also refuse
authentication, of course.

The solution is to return C<FORBIDDEN>, but before that you should
set a custom error page for this specific handler, with the help of
C<$r-E<gt>custom_response>.  It looks something like this:

  use Apache::Constants qw(:common);
  $r->custom_response(FORBIDDEN, "/errors/suspended_account.html");
  return FORBIDDEN if $suspended;

=head1 Apache::Auth* modules

=over

=item * PerlAuthenHandlers

  Apache::AuthAny          Authenticate with any username/password
  Apache::AuthenCache      Cache authentication credentials
  Apache::AuthCookie       Authen + Authz via cookies
  Apache::AuthenDBI        Authenticate via Perl's DBI
  Apache::AuthExpire       Expire Basic auth credentials
  Apache::AuthenGSS        Generic Security Service (RFC 2078)
  Apache::AuthenIMAP       Authentication via an IMAP server
  Apache::AuthenPasswdSrv  External authentication server
  Apache::AuthenPasswd     Authenticate against /etc/passwd
  Apache::AuthLDAP         LDAP authentication module
  Apache::AuthPerLDAP      LDAP authentication module (PerLDAP)
  Apache::AuthenNIS        NIS authentication
  Apache::AuthNISPlus      NIS Plus authentication/authorization
  Apache::AuthenRadius     Authentication via a Radius server
  Apache::AuthenSmb        Authenticate against an NT server
  Apache::AuthenURL        Authenticate via another URL
  Apache::DBILogin         Authenticate to a backend database
  Apache::DCELogin         Obtain a DCE login context
  Apache::PHLogin          Authenticate via a PH database
  Apache::TicketAccess     Ticket-based access/authentication

=item * PerlAuthzHandlers

  Apache::AuthCookie       Authen + Authz via cookies
  Apache::AuthzAge         Authorize based on age
  Apache::AuthzDCE         DFS/DCE ACL based access control
  Apache::AuthzDBI         Group authorization via Perl's DBI
  Apache::AuthzGender      Authorize based on gender
  Apache::AuthzNIS         NIS authorization
  Apache::AuthzPasswd      Authorize against /etc/passwd
  Apache::AuthzSSL         Authorize based on client cert
  Apache::RoleAuthz        Role-based authorization

=item * PerlAccessHandlers

  Apache::AccessLimitNum   Limit user access by number of requests
  Apache::BlockAgent       Block access from certain agents
  Apache::DayLimit         Limit access based on day of week
  Apache::IPThrottle       Limit bandwidth consumption by IP
  Apache::RobotLimit       Limit access of robots
  Apache::SpeedLimit       Control client request rate

=back

=cut

1.1                  modperl-docs/src/docs/1.0/guide/snippets.pod

Index: snippets.pod
===================================================================
=head1 NAME

Code Snippets

=head1 Redirecting Errors to the Client Instead of error_log

Many error conditions result in an I<exception> (or I<signal> -- same
thing) which is I<raised> in order to tell the operating system that
a condition has arisen which needs more urgent attention than can be
given by other means.  One of the most familiar ways of raising a
signal is to hit C<Ctrl-C> on your terminal's keyboard.  The signal
interrupts the processor.  In the case of C<Ctrl-C> the C<INT> signal
is generated and the interrupt is usually I<trapped> by a default
I<signal handler> supplied by the OS, which causes the operating
system to stop the process currently attached to the terminal.

Under mod_perl, a Perl runtime error causes an exception.  By default
this exception is trapped by the default mod_perl handler.
The handler logs information about the error (such as the date and
time that the error occurred) to I<error_log>.  If you want to
redirect this information to the client instead of to I<error_log>,
you have to take on the responsibility yourself, by writing your own
exception handler to implement this behaviour.  See the section
"L<Exception Handling for mod_perl|perl/Exception_Handling_for_mod_perl>"
for more information.

The code examples below can be useful with your own exception
handlers as well as with the default handlers.

META: Integrate the 2 sections

The C<CGI::Carp> package implements handlers for signals.  To trap
(almost) all Perl run-time errors and send the output to the client
instead of to Apache's C<error_log>, add this line to your script:

  use CGI::Carp qw(fatalsToBrowser);

Refer to the C<CGI::Carp> man page for more detailed information.

You can also trap individual exceptions: for example you can write
custom C<__DIE__> and C<__WARN__> signal handlers.  The special
C<%SIG> hash contains references to signal handlers.  The signal
handler is just a subroutine; in the example below it is called
"mydie".  To install the handler we assign a reference to our handler
to the appropriate element of the C<%SIG> hash.  This causes the
signal handler to call mydie(error_message) whenever the C<die()> sub
is called as a result of something which happened while our script
was executing.

Do not forget the C<local> keyword!  If you do, then once the signal
handler has been installed it will be called whenever C<die()> is
called by I<any> script executed by the same process.  Probably
that's not what you want.  If it is, you can put the assignment
statement in any module, as long as it will be executed at the right
time.

Here is an example of a handler which I wrote because I wanted users
to know that there was an error, without displaying the error
message, which is instead emailed to me.  If the error is caused by
the user (e.g. uploading an image whose size is bigger than the limit
I have set), I want to tell him about it.  I wrote this handler for
the mod_perl environment, but it works correctly when called from the
shell.

The code shown below is a stripped-down version with additional
comments.  The following code must be added to the script:

  # Using the local() keyword restricts the scope of the directive to
  # the block in which it is found, so this line must be added at the
  # right place in the right script.  It will not affect other blocks
  # unless the local() keyword is removed.  Usually you will want the
  # directive to affect the entire script, so you just place it near
  # the beginning of the file, where the innermost enclosing block is
  # the file itself.
  local $SIG{__DIE__} = \&mydie;

  # The line above assumes that the subroutine "mydie" is in the same
  # script.  Alternatively you can create a separate module for the
  # error handler.  If you put the signal handler in a separate
  # module, e.g. Error.pm, you must explicitly give the package name
  # to set the handler in your script, using a line like this instead
  # of the one above:
  local $SIG{__DIE__} = \&Error::mydie;
  # again, within the script!

  # Do not forget the local(), unless you want this signal handler to
  # be invoked every time any script dies (including events where
  # this treatment may be undesirable).
  my $max_image_size = 100*1024; # 100k
  my $admin_email    = '[EMAIL PROTECTED]';

  # Here is the handler itself:
  # The handler is called with a text message in a scalar argument
  sub mydie {
      my $why = shift;

      chomp $why;
      my $orig_why = $why;       # an ASCII copy for the email report

      # handle the shell execution case (so we will not get all the HTML)
      print("Error: $why\n"), exit unless $ENV{MOD_PERL};

      my $should_email = 0;
      my $message      = '';

      $why =~ s/[<&>]/"&#".ord($&).";"/ge;    # entity escape

      # Now we need to trap various kinds of errors that come from
      # CGI.pm.  We don't want these errors to be emailed to us,
      # since they aren't programmatic errors.
      if ($orig_why =~ /Client attempted to POST (\d+) bytes/o) {

          $message = qq{
              You cannot POST messages bigger than
              @{[$max_image_size]} bytes.<BR>
              You have tried to post $1 bytes.<BR>
              If you are trying to upload an image, make sure its
              size is no bigger than @{[$max_image_size]} bytes.<P>
              Thank you!
          };

      } elsif ($orig_why =~ /Malformed multipart POST/o) {

          $message = qq{
              Have you tried to upload an image in the wrong way?<P>
              To successfully upload an image you must use a browser
              that supports image upload and use the 'Browse' button
              to select that image.  DO NOT type the path to the
              image into the upload field.<P>
              Thank you!
          };

      } elsif ($orig_why =~ /closed socket during multipart read/o) {

          $message = qq{
              Have you pressed a 'STOP' button?<BR>
              Please try again!<P>
              Thank you!
          };

      } else {

          $message = qq{
              <B>You need to take no action, since the error report
              has already been sent to the webmaster.</B><BR><P>
              <B>Thank you for your patience!</B>
          };
          $should_email = 1;
      }

      print qq{Content-type: text/html

  <HTML><BODY BGCOLOR="white">
  <B>Oops, something went wrong.</B><P>
  $message
  </BODY></HTML>};

      # send an email report if appropriate
      if ($should_email) {

          # import sendmail subs
          use Mail ();

          # prepare the email error report:
          my $subject = "Error Report";
          my $body    = qq|
              An error has happened:

              $orig_why
          |;

          # send error reports to the admin
          send_mail($admin_email, $admin_email, $subject, $body);
          print STDERR "[".scalar localtime()."] [SIGDIE] Sending Error Email\n";
      }

      # print to error_log so we will know there was an error
      print STDERR "[".scalar localtime()."] [SIGDIE] $orig_why \n";

      exit 1;
  } # end of sub mydie

You may have noticed that I trap CGI.pm's die() calls here.  I don't
see any reason why my users should see ugly error messages, but
that's the way CGI.pm is written.  The workaround is to trap them
yourself.  Please note that as of version 2.49, CGI.pm provides the
cgi_error() method to report errors, and won't die() unless you want
it to.

=head1 Reusing Data from POST request

What happens if you need to access the POSTed data more than once,
say to reuse it in subsequent handlers of the same request?  POSTed
data comes directly from the socket, and at the low level data can
only be read from a socket once.  So you have to store it to make it
available for reuse.

There is an experimental option for C<Makefile.PL> called
C<PERL_STASH_POST_DATA>.  If you turn it on, you can get at the data
again with C<$r-E<gt>subprocess_env("POST_DATA")>.  This is not
enabled by default because it adds a processing overhead for each
POST request.

But what do we do with large multipart file uploads?  Because C<POST>
data is not all read in one clump, it's a problem that's not easy to
solve in a general way.  A transparent way to do this is to switch
the request method from POST to GET, and store the POST data in the
query string.
This handler does exactly that:

  Apache/POST2GET.pm
  ------------------
  package Apache::POST2GET;
  use Apache::Constants qw(M_GET OK DECLINED);

  sub handler {
      my $r = shift;
      return DECLINED unless $r->method eq "POST";
      $r->args(scalar $r->content);
      $r->method('GET');
      $r->method_number(M_GET);
      $r->headers_in->unset('Content-length');
      return OK;
  }
  1;
  __END__

In I<httpd.conf> add:

  PerlInitHandler Apache::POST2GET

or even this:

  <Limit POST>
    PerlInitHandler Apache::POST2GET
  </Limit>

to save a few more cycles, so that the handler will be called only
for POST requests.

Effectively, this trick turns the POST request into a GET request
internally.  Now when C<CGI.pm>, C<Apache::Request> or whatever
module parses the client data, it can do so more than once, since
C<$r-E<gt>args> doesn't go away (unless you make it go away by
resetting it).

If you are using C<Apache::Request>, it solves this problem for you
with its instance() class method, which allows C<Apache::Request> to
behave as a singleton.  This means that whenever you call
C<Apache::Request-E<gt>instance()> within a single request, you
always get the same C<Apache::Request> object back.

=head1 Redirecting POST Requests

Under mod_cgi it's not easy to redirect POST requests to some other
location.  With mod_perl you can easily redirect POST requests.  All
you have to do is read in the content, set the method to C<GET>,
populate C<args()> with the content to be forwarded, and finally do
the redirect:

  use Apache::Constants qw(M_GET);
  my $r = shift;
  my $content = $r->content;
  $r->method("GET");
  $r->method_number(M_GET);
  $r->headers_in->unset("Content-length");
  $r->args($content);
  $r->internal_redirect_handler("/new/url");

Of course that last line can be any other kind of redirect.

=head1 Redirecting While Maintaining Environment Variables

Let's say you have a module that sets some environment variables.  If
you redirect, that most likely tells the web browser to fetch the new
page.  This makes it a totally new request, so no environment
variables are preserved.

However, if you're using internal_redirect(), you can make the
environment variables seen in the sub-process via subprocess_env().
The only nuance is that the C<%ENV> keys will be prefixed with
C<REDIRECT_>.
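Here is a minimal sketch of the idea (the C<FOO> variable and the URI
are made up for illustration):

  # in the handler that does the redirect:
  $r->subprocess_env(FOO => 'bar');
  $r->internal_redirect('/new/uri');

  # in the code serving /new/uri (under the CGI environment):
  my $foo = $ENV{REDIRECT_FOO};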
=head1 Terminating a Child Process on Request Completion

If you want to terminate the child process serving the current
request, upon completion of processing call this anywhere in the
code:

  $r->child_terminate;

Apache won't actually terminate the child until everything it needs
to do is done and the connection is closed.

=head1 Setting Content-type and Content-encoding headers in non-OK responses

You cannot set I<Content-type> and I<Content-encoding> headers in
non-OK responses, since Apache overrides these in I<http_protocol.c>,
I<ap_send_error_response()>:

  r->content_type = "text/html; charset=iso-8859-1";

=head1 More on Relative Paths

Many people use relative paths for C<require>, C<use>, etc., and when
they open files in their scripts they make assumptions about the
current directory.  This will fail if you don't C<chdir()> to the
correct directory first (as could easily happen if you have another
script which calls the first script by its full path).

For example:

  /home/httpd/perl/test.pl:
  -------------------------
  #!/usr/bin/perl
  open IN, "./foo.txt";
  -------------------------

This snippet would work if we call the script like this:

  % chdir /home/httpd/perl
  % ./test.pl

since C<foo.txt> is located in the current directory.  But when the
current directory isn't I</home/httpd/perl>, if we call the script
like this:

  % /home/httpd/perl/test.pl

then the script will fail to find C<foo.txt>.  Think about
C<crontab>s!

Notice that you cannot use the C<FindBin.pm> package, something that
you would do in regular Perl scripts: because it relies on a C<BEGIN>
block, it won't work under mod_perl.  It's loaded and executed only
for the first script executed inside the process; all the others will
use the cached value, which will probably be incorrect if they reside
in different directories.

=head1 Watching the error_log File Without Telneting to the Server

I wrote this script a long time ago, when I had to debug my CGI
scripts but didn't have access to the C<error_log> file.  I asked the
admin to install this script, and have used it happily since then.

If your scripts are running on one of these 'Get-free-site' servers,
and you cannot debug your script because you can't telnet to the
server or see the C<error_log>, you can ask your sysadmin to install
this script.

Note that it was written for plain Apache, and isn't prepared to
handle the complex multiline error and warning messages generated by
mod_perl.  It also shells out to the tail(1) utility to do the main
work; a more efficient pure Perl implementation is probably due (take
a look at the C<File::Tail> module).  You are welcome to fix it and
contribute it back to the mod_perl community.  Thank you!

Here is the code:

  #!/usr/bin/perl -Tw

  use strict;
  use CGI;

  my $default   = 10;
  my $error_log = "/usr/local/apache/logs/error_log";

  # untaint $ENV{PATH}
  $ENV{'PATH'} = '/bin:/usr/bin';
  delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};

  my $q = new CGI;

  my $counts = (defined $q->param('count') and $q->param('count'))
      ? $q->param('count')
      : $default;

  print $q->header,
        $q->start_html(-bgcolor => "white",
                       -title   => "Error logs"),
        $q->start_form,
        $q->center(
            $q->b('How many lines to fetch? '),
            $q->textfield('count', 10, 3, 3),
            $q->submit('', 'Fetch'),
            $q->reset,
        ),
        $q->end_form,
        $q->hr;

  # untaint $counts
  $counts = ($counts =~ /(\d+)/) ? $1 : 0;

  print($q->b("$error_log doesn't exist!!!")), exit
      unless -e $error_log;

  open LOG, "tail -$counts $error_log |"
      or die "Can't tail $error_log: $!\n";
  my @logs = <LOG>;
  close LOG;

  # format and colorize each line nicely
  foreach (@logs) {
      s{
         \[(.*?)\]\s*    # date
         \[(.*?)\]\s*    # type of error
         \[(.*?)\]\s*    # client part
         (.*)            # the message
      }
      {
         "[$1] <BR> [".
         colorize($2,$2).
         "] <BR> [$3] <PRE>".
         colorize($2,$4).
         "</PRE>"
      }ex;
      print "<BR>$_<BR>";
  }

  #############
  sub colorize {
      my ($type, $context) = @_;

      my %colors = (
          error  => 'red',
          crit   => 'black',
          notice => 'green',
          warn   => 'brown',
      );

      return exists $colors{$type}
          ? qq{<B><FONT COLOR="$colors{$type}">$context</FONT></B>}
          : $context;
  }

=head1 Accessing Variables from the Caller's Package

Sometimes you want to access variables from the caller's package.
One way is to do something like this:

  {
      no strict 'refs';   # we use a symbolic reference below
      my $caller = caller;
      print qq[$caller --- ${"${caller}::var"}];
  }

=head1 Handling Cookies

Unless you use some well-known module like C<CGI::Cookie> or
C<Apache::Cookie>, you need to handle cookies yourself.  Cookies come
in the C<$ENV{HTTP_COOKIE}> variable.  You can print the raw cookie
string as C<$ENV{HTTP_COOKIE}>.
Here is a fairly well-known bit of code to take cookie values and put
them into a hash:

  sub get_cookies {
      # cookies are separated by a semicolon and a space; this will
      # split them and return a hash of cookies
      my @rawCookies = split /; /, $ENV{'HTTP_COOKIE'};
      my %cookies;

      foreach (@rawCookies) {
          my ($key, $val) = split /=/, $_;
          $cookies{$key} = $val;
      }

      return %cookies;
  }

Or a slimmer version:

  sub get_cookies {
      map { split /=/, $_, 2 } split /; /, $ENV{'HTTP_COOKIE'};
  }

=head1 Sending Multiple Cookies with the Perl API

Given that you have prepared your cookies in C<@cookies>, the
following code will submit all the cookies:

  for (@cookies) {
      $r->headers_out->add('Set-Cookie' => $_);
  }

=head1 Sending Cookies in REDIRECT Response

You should use err_headers_out() and not headers_out() when you want
to send cookies in a C<REDIRECT> response.

  use Apache::Constants qw(REDIRECT OK);
  my $r = shift;
  # prepare the cookie in $cookie
  $r->err_headers_out->add('Set-Cookie' => $cookie);
  $r->headers_out->set(Location => $location);
  $r->status(REDIRECT);
  $r->send_http_header;
  return OK;

=head1 Passing and Preserving Custom Data Structures Between Handlers

Let's say that you wrote a few handlers to process a request, and
they all need to share some custom Perl data structure.  The pnotes()
method comes to your rescue.

  # a handler that gets executed first
  my %my_data = (foo => 'mod_perl', bar => 'rules');
  $r->pnotes('my_data' => \%my_data);

The handler prepares the data in the hash C<%my_data> and calls the
pnotes() method to store the data internally, for other handlers to
reuse.  All the subsequently called handlers can retrieve the stored
data in this way:

  my $info = $r->pnotes('my_data');
  print $info->{foo};

prints:

  mod_perl

The stored information will be destroyed at the end of the request.

=head1 Passing Notes Between mod_perl and other (non-Perl) Apache Modules

The notes() method can be used to make various Apache modules talk to
each other.  In the following snippet the PHP module talks to the
mod_perl code (PHP code):

  if (isset($user) && substr($user,0,1) == "+") {
      apache_note("user", substr($user,1));
      virtual("/internal/getquota");

      $quota      = apache_note("quota");
      $usage_pp   = apache_note("usage_pp");
      $percent_pp = apache_note("percent_pp");

      if ($quota)
          $message .= " | Using $percent_pp% of $quota limit";
  }

The PHP code sets the I<user> note to the username using the notes
mechanism, then issues a sub-request to a Perl handler:

  use Apache::Constants qw(REDIRECT OK);
  my $r = shift;

  my $notes = $r->main->notes();
  my ($quota, $usage_pp, $percent_pp) = getquota($notes->{user} || '');

  $r->notes('quota',      $quota);
  $r->notes('usage_pp',   $usage_pp);
  $r->notes('percent_pp', $percent_pp);
  return OK;

which retrieves the username from the notes (using
C<$r-E<gt>main-E<gt>notes>), uses some getquota() function to get the
quota-related data, and then sets the acquired data in the notes for
the PHP code.  Now the PHP code reads the data from the notes and
proceeds with setting C<$message> if C<$quota> is set.

So any Apache modules can communicate with each other via the Apache
notes() mechanism.

You can use notes along with the sub-request methods lookup_uri() and
lookup_filename() too.  To make it work, you need to set a note in
the sub-request.
For example, if you want to call a PHP sub-request from within
mod_perl and pass it a note, you can do it in the following way:

  my $subr = $r->lookup_uri('wizard.php3');
  $subr->notes('answer' => 42);
  $subr->run;

As of the time of this writing you cannot access the parent request
tables from a PHP handler, so you must set this note for the
sub-request.  If, on the other hand, the sub-request is running in
the mod_perl domain, you can always keep the notes in the parent
request's notes table and access them via the main() method:

  $r->main->notes('answer');

=head1 Passing Environment Variables Between Handlers

This is a simple example of passing environment variables between
handlers.  Given the configuration:

  PerlAccessHandler My::Access
  PerlLogHandler    My::Log

and this code in I<startup.pl>:

  use Apache::Constants qw(OK);

  sub My::Access::handler {
      my $r = shift;
      $r->subprocess_env(TICKET => $$);
      $r->notes(TICKET => $$);
      return OK;
  }

  sub My::Log::handler {
      my $r = shift;
      my $env  = $r->subprocess_env('TICKET');
      my $note = $r->notes('TICKET');
      warn "env=$env, note=$note\n";
      return OK;
  }

Adding C<%{TICKET}e> and C<%{TICKET}n> to the C<LogFormat> for
I<access_log> works fine too.

=head1 Verifying Whether A Request Was Received Over An SSL Connection

Just as C<$ENV{MOD_PERL}> is checked to see whether the code is
running under mod_perl, C<$ENV{HTTPS}> is set by SSL modules and
therefore can be used to check whether a request is running over an
SSL connection.  For example:

  print "SSL" if $ENV{HTTPS};

If the C<PerlSetupEnv Off> setting is in effect, C<$ENV{HTTPS}> won't
be available, and then:

  print "SSL" if $r->subprocess_env('https');

should be used instead.

Note that it's also possible to check the scheme:

  print "SSL" if Apache::URI->parse($r)->scheme =~ m/^https/;

but it's not one hundred percent certain unless you control the
server and you know that you run the secure server on port 443.

=head1 CGI::params in the mod_perl-ish Way

You can retrieve the request parameters, in a way similar to
C<CGI::params>, using this technique:

  my $r = shift;  # or $r = Apache->request
  my %params = $r->method eq 'POST' ? $r->content : $r->args;

assuming that all your variables are single key-value pairs.

Also take a look at C<Apache::Request>, which has the same API as
C<CGI.pm> for extracting and setting request parameters.

=head1 Subclassing Apache::Request

To subclass a package you simply modify C<@ISA>, for example:

  package My::TestAPR;

  use strict;
  use vars qw(@ISA);
  @ISA = qw(Apache::Request);

  sub new {
      my ($proto, $apr) = @_;
      my $class = ref($proto) || $proto;
      bless { _r => $apr }, $class;
  }

  sub param {
      my ($self, $key) = @_;
      my $apr = $self->{_r};
      # Here we are calling the Apache::Request object's param method
      $apr->param($key);
  }

  sub sum {
      my ($self, $key) = @_;
      my $apr = $self->{_r};
      my @values = $apr->param($key);
      my $sum = 0;
      for (@values) {
          $sum += $_;
      }
      $sum;
  }
  1;
  __END__

=head1 Sending Email from mod_perl

There is nothing special about sending email from mod_perl, it's just
that we do it a lot.  There are a few important issues.

The most widely used approach is starting a C<sendmail> process and
piping the headers and the body to it (a short sketch of this follows
the list of flags below).  The problem is that C<sendmail> is a very
heavy process, and it makes mod_perl processes less efficient.

If you don't want your process to wait until delivery is complete,
you can tell C<sendmail> not to deliver the email straight away, but
either to do it in the background or just to queue the job until the
next queue run.  This can significantly reduce the delay for the
mod_perl process, which would otherwise have to wait for the
C<sendmail> process to complete.  This can be specified for all
deliveries in I<sendmail.cf> or on each invocation on the C<sendmail>
command line:

=over

=item *

C<-odb> (deliver in the background)

=item *

C<-odq> (queue-only) or

=item *

C<-odd> (queue, and also defer the DNS/NIS lookups)

=back
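Here is a minimal sketch of the piping approach using the queue-only
flag (the path I</usr/lib/sendmail> and the addresses are
assumptions; adjust them for your system):

  # -t   : take the recipients from the To:/Cc:/Bcc: headers
  # -odq : queue the message instead of delivering it right away
  open MAIL, "|/usr/lib/sendmail -t -odq"
      or die "Cannot fork sendmail: $!";
  print MAIL <<'EOT';
  To: to@example.com
  From: from@example.com
  Subject: queued delivery test

  message body
  EOT
  close MAIL;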
The trend is to move away from sendmail(1) and to switch to lighter
mail delivery programs like qmail(1) or postfix(1).  You should check
the manpage of your favorite mailer application for the equivalent of
the configuration presented above for sendmail(1).

The most efficient approach is to talk directly to the SMTP server.
Luckily the C<Net::SMTP> module makes this very easy.  The only
problem is when C<Net::SMTP> fails to deliver the mail because the
destination peer server is temporarily down.  On the other hand,
C<Net::SMTP> allows you to send email much faster, since you don't
have to invoke a dedicated process.  Here is an example of a
subroutine that sends email:

  use Net::SMTP ();
  use Carp qw(carp verbose);

  #
  # Sends email via the SMTP server as defined in Net::Config.
  # Alternatively you can hardcode it here; look for $smtp_server
  # below.
  #
  sub send_mail {
      my ($from, $to, $subject, $body) = @_;

      # prefer to exit early if there are errors
      carp "From missing" unless defined $from;
      carp "To missing"   unless defined $to;

      my $mail_message = <<__END_OF_MAIL__;
  To: $to
  From: $from
  Subject: $subject

  $body
  __END_OF_MAIL__

      # Set this parameter if you don't have a valid Net/Config.pm
      # entry for the SMTP host, and uncomment it in the
      # Net::SMTP->new call:
      # my $smtp_server = 'localhost';

      # init the server
      my $smtp = Net::SMTP->new(
          # $smtp_server,
          Timeout => 60,
          Debug   => 0,
      );

      $smtp->mail($from) or carp "Failed to specify a sender [$from]\n";
      $smtp->to($to)     or carp "Failed to specify a recipient [$to]\n";

      $smtp->data([$mail_message]) or carp "Failed to send a message\n";

      $smtp->quit or carp "Failed to quit\n";

  } # end of sub send_mail

=head1 A Simple Handler To Print The Environment Variables

The code:

  package MyEnv;
  use Apache;
  use Apache::Constants;

  sub handler {
      my $r = shift;
      $r->send_http_header("text/plain");
      print map { "$_ => $ENV{$_}\n" } keys %ENV;
      return OK;
  }
  1;

The configuration:

  PerlModule MyEnv
  <Location /env>
    SetHandler perl-script
    PerlHandler MyEnv
  </Location>

The invocation:

  http://localhost/env

=head1 mod_rewrite in Perl

We can easily implement in Perl everything that mod_rewrite does,
with the help of C<PerlTransHandler>, which is invoked at the
beginning of request processing.  For example, suppose that we need
to perform a redirect based on the query string and the URI; the
following handler does that:

  package Apache::MyRedirect;
  use Apache::Constants qw(OK REDIRECT);
  use constant DEFAULT_URI => 'http://www.example.org';

  sub handler {
      my $r    = shift;
      my %args = $r->args;
      my $path = $r->uri;

      my $uri = (($args{'uri'}) ? $args{'uri'} : DEFAULT_URI) . $path;

      $r->header_out(Location => $uri);
      $r->status(REDIRECT);
      $r->send_http_header;

      return OK;
  }

Set it up in I<httpd.conf> as:

  PerlTransHandler Apache::MyRedirect

The code consists of three parts: request data retrieval, deciding
what to do based on this data, and finally setting the headers and
the status and issuing the redirect.  So if a client submits a
request of this kind:

  http://www.example.com/news/?uri=http://www.example-2.com/

C<$uri> will hold I<http://www.example-2.com/news/>, and that's where
the request will be redirected.
=head1 URI Rewrite in PerlTransHandler Suppose that before a content handler is invoked you want to make this translation: /articles/10/index.html => /articles/index.html?id=10 This I<TransHandler> will do that for you: My/Trans.pm ----------- package My::Trans; use Apache::Constants qw(:common); sub handler { my $r = shift; my $uri = $r->uri; my ($id) = ($uri =~ m|^/articles/(.*?)/|); $r->uri("/articles/index.html"); $r->args("id=$id"); return DECLINED; } 1; and in I<httpd.conf>: PerlModule My::Trans PerlTransHandler My::Trans The handler code retrieves the request object and the URI. Then it retrieves the I<id> using the regular expression. Finally it sets the new value of the URI and the arguments string. The handler returns C<DECLINED> so the default Apache transhandler will take care of URI to filename remapping. Notice the technique to set the arguments. By the time the Apache-request object has been created, arguments are handled in a separate slot, so you cannot just push them into the original URI. Therefore the args() method should be used. =head1 Setting PerlHandler Based on MIME Type It's very easy to implement a dispatching module based on the MIME type of request. So a different content handler will be called for a different MIME type. This is an example of such a dispatcher: package My::MimeTypeDispatch; use Apache::Constants qw(DECLINED); my %mime_types = ( 'text/html' => \&HTML::Template::handler, 'text/plain' => \&My::Text::handler, ); sub handler { my $r = shift; if (my $h = $mime_types{$r->content_type}) { $r->push_handlers(PerlHandler => $h); $r->handler('perl-script'); } return DECLINED; } 1; __END__ And in I<httpd.conf> we add: PerlFixupHandler My::MimeTypeDispatch After declaring the package name and importing constants, we set a translation table of MIME types and corresponding handlers to be called. Then comes the handler, where the request object is retrieved and if its MIME type is found in our translation table we set the handler that should handle this request. Otherwise we do nothing. At the end we return C<DECLINED> so some other fixup handler could take over. =head1 SSI and Embperl -- Doing Both This handler lets you use both SSI and Embperl in the same request: Use it in a C<E<lt>FilesMatchE<gt>> Section or similar: PerlModule Apache::EmbperlFilter Apache::SSI <FilesMatch "\.epl"> PerlSetVar Filter On PerlHandler Apache::EmbperlFilter Apache::SSI </FilesMatch> package Apache::EmbperlFilter; use Apache::Util qw(parsedate); use HTML::Embperl; use Apache::SSI (); use Apache::Constants; use strict; use vars qw($VERSION); $VERSION = '0.03'; my ($r, %param, $input, $output); sub handler { $r = shift; my ($fh, $status) = $r->filter_input(); unless ($status == OK) { return $status } local $/ = undef; $input = scalar(<$fh>); %param = (); $param{input} = \$input; $param{req_rec} = $r; $param{output} = \$output; $param{mtime} = mtime(); $param{inputfile} = $r->filename(); HTML::Embperl::ScanEnvironement(\%param); HTML::Embperl::Execute(\%param); print $output; return OK; } sub mtime { my $mtime = undef; if (my $last_modified = $r->headers_out->{'Last-Modified'}) { $mtime = parsedate $last_modified; } $mtime; } 1; __END__ =head1 Getting the Front-end Server's Name in the Back-end Server Assume that you have more than one front-end server, and you want to dynamically figure out the front-end server name in the back-end server. mod_proxy and mod_rewrite provide the solution. 
Compile apache with both mod_proxy and mod_rewrite, then use a
directive something like this:

  RewriteEngine   On
  RewriteLog      /somewhere/rewrite.log
  RewriteLogLevel 3
  RewriteRule ^/foo/bar(.*)$ \
  http://example.com:8080/foo/bar/$1?IP=%{REMOTE_HOST} [QSA,P]

This will have all the URLs starting with I</foo/bar> proxied off to
the other server at the same URL. It will append the C<REMOTE_HOST>
variable as a query string argument. (QSA = Query String Append, P =
Proxy). There is probably a way to remap it as an X-Header of some
sort, but if a query string is good enough for you, then this should
work really nicely.

=head1 Authentication Snippets

Getting the authenticated username: C<$r-E<gt>connection-E<gt>user()>,
or C<$ENV{REMOTE_USER}> if you're in a CGI emulation. Example:

  my $r = shift;

  my ($res, $sent_pwd) = $r->get_basic_auth_pw;
  return $res if $res; # decline if not Basic

  my $user = $r->connection->user;

=head1 Emulating the Authentication Mechanism

You can provide your own mechanism to authenticate users, instead of
the standard one. If you want to make Apache think that the user was
authenticated by the standard mechanism, set the username with:

  $r->connection->user('username');

Now you can use this information, for example, during logging, so that
you can have your "username" passed as if it were transmitted to
Apache through HTTP authentication.

=head1 An example of using Apache::Session::DBI with cookies

META: should be annotated at some point. (an example was posted to the
mod_perl list)

  use strict;
  use DBI;
  use Apache::Session::DBI;
  use CGI;

  # [...]

  # Initiate a session ID
  my $session;
  my $opts = {
      autocommit => 0,
      lifetime   => 3600,   # 3600 is one hour
  };

  # Read in the cookie if this is an old session
  my $r = Apache->request;
  my $no_cookie = '';
  my $cookie = $r->header_in('Cookie');
  {
      # eliminate logging from Apache::Session::DBI's use of `warn'
      local $^W = 0;

      if (defined($cookie) && $cookie ne '') {
          $cookie =~ s/SESSION_ID=(\w*)/$1/;
          $session = Apache::Session::DBI->open($cookie, $opts);
          $no_cookie = 'Y' unless defined($session);
      }
      # Could have been obsolete - get a new one
      $session = Apache::Session::DBI->new($opts)
          unless defined($session);
  }

  # Might be a new session, so let's give them a cookie back
  if (! defined($cookie) || $no_cookie) {
      local $^W = 0;

      my $session_cookie = "SESSION_ID=$session->{'_ID'}";
      $r->header_out("Set-Cookie" => $session_cookie);
  }

=head1 Using DESTROY to Finalize Output

Well, as always with Perl -- TMTOWTDI (There's More Than One Way To Do
It), one of the readers is using C<DESTROY> to finalize output, and as
a cheap means of buffering:

  package buffer;
  use Apache;

  sub new {
      my $class = shift;
      my $self = bless { 'r' => shift, 'message' => "" }, $class;
      $self->{apr} = Apache::Request->new($self->{r},
                                          POST_MAX => (32*1024));
      $self->{r}->content_type('text/plain');
      $self->{r}->no_cache(1);
      $self;
  }

  sub message {
      my $self = shift;
      $self->{message} .= join("\n", @_);
  }

  sub DESTROY {
      my $self = shift;
      $self->{apr}->send_http_header;
      $self->{apr}->print($self->{message});
  }
  1;

Now you can have perl scripts like:

  use buffer;
  my $b = new buffer(shift);

  $b->message(p("Hello World"));
  # end

and save a bunch of duplicate code across otherwise inconvenient
gaggles of small scripts.

But suppose you also want to redirect the client under some
circumstances, and send the HTTP status code 302.
You might try this:

  sub redir {
      my $self = shift;
      $self->{redirect} = shift;
      exit;
  }

and re-code C<DESTROY> as:

  sub DESTROY {
      my $self = shift;
      if ($self->{redirect}) {
          $self->{apr}->status(REDIRECT);
          $self->{apr}->header_out("Location", $self->{redirect});
          $self->{apr}->send_http_header;
          $self->{apr}->print($self->{redirect});
      }
      else {
          $self->{apr}->send_http_header;
          $self->{apr}->print($self->{message});
      }
  }

But you'll find that while the browser redirects itself, mod_perl logs
the result code as 200. It turns out that status() only touches the
Apache response, and the log message is determined by the Apache
return code.

Aha! So we'll change the exit() in redir() to exit(REDIRECT). This
fixes the log code, but causes a bogus I<"[error] 302"> line in the
I<error_log>. That comes from C<Apache::Registry>:

  my $errsv = "";
  if($@) {
      $errsv = $@;
      $@ = ''; #XXX fix me, if we don't do this Apache::exit() breaks
      $@{$r->uri} = $errsv;
  }

  if($errsv) {
      $r->log_error($errsv);
      return SERVER_ERROR unless $Debug && $Debug & 2;
      return Apache::Debug::dump($r, SERVER_ERROR);
  }

So you see that any time the return code causes C<$@> to return true,
we'll get an error line. Not wanting this, what can we do?

We can hope that a future version of mod_perl will allow us to set the
HTTP result code independently of the handler return code (perhaps a
log_status() method? or at least an C<Apache::LOG_HANDLER_RESULT>
config variable?).

In the meantime, there's
L<Apache::RedirectLogFix|snippets/Code_Snippets_>. Put this in your
I<httpd.conf>:

  PerlLogHandler Apache::RedirectLogFix

and take a look at the source code below. Note that it requires us to
return the HTTP status code 200.

  package Apache::RedirectLogFix;

  use Apache::Constants qw(OK DECLINED REDIRECT);

  sub handler {
      my $r = shift;
      return DECLINED unless $r->handler
                          && ($r->handler eq "perl-script");

      if (my $loc = $r->header_out("Location")) {
          if ($r->status == 200 and substr($loc, 0, 1) ne "/") {
              $r->status(REDIRECT);
              return OK;
          }
      }
      return DECLINED;
  }
  1;

Now, if we wanted to do the same sort of thing for an error 500
handler, we could write another C<PerlLogHandler> (call it
C<ServerErrorLogFix>). But we'll leave that as an exercise for the
reader, and hope that it won't be needed in the next mod_perl
release. After all, it's a little awkward to need a C<LogHandler> to
clean up after ourselves....

=head1 Setting Environment Variables For Scripts Called From CGI.

Perl uses the shell (C</bin/sh>) for such C<system()> and piped
C<open()> calls. So if you want to set a temporary environment
variable when you call a script from your CGI, you do something like
this:

  open UTIL, "USER=stas script.pl | " or die "...: $!\n";

or

  system "USER=stas script.pl";

(C<VAR=value command> sets the variable in the command's environment,
whereas C<VAR=value ; command> would only set a shell variable that
the command never sees.)

This is useful, for example, if you need to invoke a script that uses
CGI.pm from within a mod_perl script. We are tricking the Perl script
into thinking it's a simple CGI, which is not running under mod_perl:

  open(PUBLISH, "GATEWAY_INTERFACE=CGI/1.1 script.cgi \"param1=value1&param2=value2\" |")
      or die "...: $!\n";

Make sure that the parameters you pass are shell safe -- all "unsafe"
characters like single-quote and back-tick should be properly escaped.

Unfortunately this approach uses fork() to run the script, so you have
probably thrown out the window most of the performance gained from
using mod_perl. To avoid the fork, change script.cgi to a module
containing a subroutine which you can then call directly from your
mod_perl script.
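Such a conversion might look like the following sketch; the package
and parameter names are made up for illustration:

  # My/Publish.pm -- the former script.cgi wrapped in a subroutine
  package My::Publish;
  use strict;

  sub publish {
      my %params = @_;
      # ... here goes the code that used to live in script.cgi ...
      return "published: $params{param1}, $params{param2}";
  }
  1;

and then from the mod_perl script:

  use My::Publish ();
  my $result = My::Publish::publish(param1 => 'value1',
                                    param2 => 'value2');

No shell, no fork, and no need to fake C<GATEWAY_INTERFACE> at all.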
=head1 Mysql Backup and Restore Scripts This is somewhat off-topic, but since many of us use mysql or some other RDBMS in their work with mod_perl driven sites, it's good to know how to backup and restore the databases in case of database corruption. First we should tell mysql to log all the clauses that modify the databases (we don't care about SELECT queries for database backups). Modify the C<safe_mysql> script by adding the I<--log-update> options to the C<mysql> server startup parameters and restart the server. From now on all the non-select queries will be logged to the I</var/lib/mysql/www.bar.com> logfile. Your hostname will show up instead of I<www.bar.com>. Now create a I<dump> directory under I</var/lib/mysql/>. That's where the backups will be stored (you can name the directory as you wish of course). Prepare the backup script and store it in a file, e.g: I</usr/local/sbin/mysql/mysql.backup.pl> =code mysql-3.22.29_backup.pl This is the original code =code mysql-3.22.30+_backup.pl This is the code modified to work with mysql-3.22.30+ You might need to change the executable paths according to your system. List the names of the databases you want to backup using the C<db_names> array. Now make the script executable and arrange the crontab entry to run the backup script nightly. Note that the disk space used by the backups will grow without bound and you should remove the old backups. Here is a sample crontab entry to run the script at 4am every day: 0 4 * * * /usr/local/sbin/mysql/mysql.backup.pl > /dev/null 2>&1 So now at any moment we have the dump of the databases from the last execution of the backup script and the log file of all the clauses that have updated the databases since then. If the database gets corrupted we have all the information to restore it to the state it was in at our last backup. We restore it with the following script, which I put in: I</usr/local/sbin/mysql/mysql.restore.pl> =code mysql-3.22.29_restore.pl This is the original code =code mysql-3.22.30+_restore.pl This is the code modified to work with mysql-3.22.30+ These are kinda dirty scripts, but they work... if you come up with cleaner scripts, please contribute them... thanks Update: there is now a "mysqlhotcopy" utility distributed with MySQL that can make an atomic snapshot of a database. (by Tim Bunce) So you may consider using it instead. =cut 1.1 modperl-docs/src/docs/1.0/guide/start.pod Index: start.pod =================================================================== =head1 NAME Guide's Overview =head1 What's inside? Before you start a mod_perl installation, you should have an overall picture of this wonderful technology. There is more than one way to use a mod_perl-enabled webserver. You have to decide which mod_perl scheme you want to use. L<Picking the Right Strategy|strategy/> chapter presents various approaches and discusses their pros and cons. Once you know what best fits your requirements, you should proceed to L<Real World Scenarios Implementation|scenario/>. This chapter provides very detailed scenarios of the schemes discussed in the L<Choosing the Right Strategy|strategy/> chapter. The L<Server Installation|install/> chapter follows on from the L<Real World Scenarios Implementation|scenario/> chapter by providing more in-depth installation details. The L<Server Configuration|config/> chapter adds to the basic configurations presented in the L<Real World Scenarios Implementation|scenario/> chapter with extended configurations and various configuration examples. 
The L<Frequent mod_perl problems|frequent/> chapter is a collection of links to other chapters. It is an attempt to stress some of the most frequently encountered mod_perl problems. So this is the first place you should check if you have got a problem. Probably the most important chapter is L<CGI to mod_perl Porting. mod_perl Coding guidelines|porting/>. It explains the differences between scripts running under mod_cgi and mod_perl, and what should be done in order to make existing scripts run under mod_perl. Along with the porting notes it provides guidelines for proper mod_perl programming. L<Performance. Benchmarks|performance/> is the biggest and a very important chapter. It explains the details of tuning mod_perl and the scripts running under it, so you can squeeze every ounce of the power from your server. A large part of the chapter is benchmarks, the numbers that IT managers love to read. But these are different benchmarks: they are not comparing mod_perl with similar technologies, rather with different configurations of mod_perl servers, to guide you through the tuning process. I have to admit, performance tuning is a very hard task, and demands a lot of understanding and experience. But once you acquire this knowledge you can make magic with your server. While developing your mod_perl applications, you will begin to understand that an C<error_log> file is your best friend. It tells you all the intimate details of what is happening to your scripts. But the problem is that it speaks a secret language. To learn the alphabet and the grammar of this language, refer to the chapter L<Warnings and Errors: Where and Why|troubleshooting/>. L<Protecting Your Site|security/> - All about security. If you are into driving relational databases with your cgi scripts, the L<mod_perl and Relational Databases|databases/> chapter will tell you all about the database-related goodies mod_perl has prepared for you. If you are using good old dbm files for your databases, the L<mod_perl and dbm files|dbm/> chapter explains how to utilize them better under mod_perl. More and more Internet Service Providers (ISPs) are evaluating the possibility of providing mod_perl services to their users. Is this possible? Is it secure? Will it work? What resources does it take? The L<mod_perl for ISPs. mod_perl and Virtual Hosts|multiuser/> chapter answers all these questions. If you want to run a mod_perl- enabled server, but do not have root access, read this chapter as well, either to learn how to do it yourself, or maybe to persuade your ISP to provide this service. If you have to administer your Apache mod_perl server the L<Controlling and Monitoring the Server|control/> chapter is for you. Among the topics are: server restarting and monitoring techniques, preventing the server from eating up all your disk space in a matter of minutes, and more. (META: fix this) The L<mod_perl Status. Peeking into the Server's Perl Innards|config/mod_perl_Configuration_> chapter shows you the ways you can peek at what is going on in a mod_perl-enabled server while it is running. Like looking at the value of some global variable, what database connections are open, looking up what modules are loaded and their paths, what is the value of C<@INC>, and much more. Every programmer needs to know how to debug her program. It is an _easy_ task with plain Perl. Just invoke the program with the C<-d> flag and debug it. Is it possible to do the same under mod_perl? 
After all you cannot debug every CGI script by executing it from the command line: some scripts will not run from the command line. The L<Debugging mod_perl|debug/> chapter proves debugging under mod_perl is possible. Sometimes browsers that interact with our servers have bugs, which cause big headaches for CGI developers. Coping with these bugs is discussed in the L<Workarounds for some known bugs in browsers|browserbugs/> chapter. Many modules were written to extend mod_perl's core functionality. Some important modules are covered in the L<Apache::* modules|modules/> chapter. Some folks decide to go with mod_perl even though they are not experienced Perl programmers. mod_perl is a demanding environment which does not permit the `casual' programming style which plain Perl allows. Lack of knowledge/experience with Perl need not be any barrier; Perl comes with copious and high quality on-line documentation and there are many Perl books available which will get you up to speed. Get a good Perl book and start reading. The L<Perl Reference|perl/> chapter gives some basic, mod_perl specific Perl lessons, delivering the knowledge without which you cannot start to program mod_perl scripts. The L<Code Snippets|snippets/> chapter is just a collection of code snippets I have found useful while writing the scripts. The L<Choosing an Operating System and Hardware|hardware/> chapter gives you an idea on how to choose the software and hardware for the webserver. The L<mod_perl Advocacy|advocacy/mod_perl_Advocacy_> tries to make it easier to advocate mod_perl around the world. The L<Getting Help and Further Learning|help/> chapter refers you to other related information resources, like learning Perl programming and SQL, understanding security, building databases, and more. L<Appendix A: Downloading software and documentation|download/> includes pointers to the software that was explained and/or mentioned in this guide. =cut 1.1 modperl-docs/src/docs/1.0/guide/strategy.pod Index: strategy.pod =================================================================== =head1 NAME Choosing the Right Strategy =head1 Do it like I do it!? There is no such thing as the B<right> strategy in the web server business, although there are many wrong ones. Never believe a person who says: I<"Do it this way, this is the best!">. As the old saying goes: I<"Trust but verify">. There are too many technologies out there to choose from, and it would take an enormous investment of time and money to try to validate each one before deciding which is the best choice for your situation. With this in mind, I will present some ways of using standalone mod_perl, and some combinations of mod_perl and other technologies. I'll describe how these things work together, offer my opinions on the pros and cons of each, the relative degree of difficulty in installing and maintaining them, and some hints on approaches that should be used and things to avoid. To be clear, I will not address all technologies and tools, but limit this discussion to those complementing mod_perl. Please let me stress it again: B<do not> blindly copy someone's setup and hope for a good result. Choose what is best for your situation -- it might take B<some> effort to find out what that is. In this chapter we will discuss =over =item * Deployment of mod_perl in Overview, with the pros and cons. =item * Alternative architectures for running one and two servers. =item * Proxy servers (Squid, and Apache's mod_proxy). 
=back =head1 mod_perl Deployment Overview There are several different ways to build, configure and deploy your mod_perl enabled server. Some of them are: =over 8 =item 1 Having one binary and one configuration file (one big binary for mod_perl). =item 2 Having two binaries and two configuration files (one big binary for mod_perl and one small binary for static objects like images.) =item 3 Having one DSO-style binary and two configuration files, with mod_perl available as a loadable object. =item 4 Any of the above plus a reverse proxy server in http accelerator mode. =back If you are a newbie, I would recommend that you start with the first option and work on getting your feet wet with Apache and mod_perl. Later, you can decide whether to move to the second one which allows better tuning at the expense of more complicated administration, or to the third option -- the more state-of-the-art-yet-suspiciously-new DSO system, or to the fourth option which gives you even more power. =over 8 =item 1 The first option will kill your production site if you serve a lot of static data from large (4 to 15MB) webserver processes. On the other hand, while testing you will have no other server interaction to mask or add to your errors. =item 2 This option allows you to tune the two servers individually, for maximum performance. However, you need to choose between running the two servers on multiple ports, multiple IPs, etc., and you have the burden of administering more than one server. You have to deal with proxying or fancy site design to keep the two servers in synchronization. =item 3 With DSO, modules can be added and removed without recompiling the server, and their code is even shared among multiple servers. You can compile just once and yet have more than one binary, by using different configuration files to load different sets of modules. The different Apache servers loaded in this way can run simultaneously to give a setup such as described in the second option above. On the down side, you are playing at the bleeding edge. You are dealing with a new solution that has weak documentation and is still subject to change. It is still somewhat platform specific. Your mileage may vary. The DSO module (C<mod_so>) adds size and complexity to your binaries. Refer to the section L<"Pros and Cons of Building mod_perl as DSO|install/Pros_and_Cons_of_Building_mod_pe"> for more information. Build details: L<Build mod_perl as DSO inside Apache source tree via APACI|install/Build_mod_perl_as_a_DSO_inside_t> =item 4 The fourth option (proxy in http accelerator mode), once correctly configured and tuned, improves the performance of any of the above three options by caching and buffering page results. =back =head1 Alternative architectures for running one and two servers The next part of this chapter discusses the pros and the cons of each of these presented configurations. L<Real World Scenarios Implementation|scenario/> describes the implementation techniques of these schemes. We will look at the following installations: =over =item * Standalone mod_perl Enabled Apache Server =item * One Plain Apache and One mod_perl-enabled Apache Servers =item * One light non-Apache and One mod_perl enabled Apache Servers =item * Adding a Proxy Server in http Accelerator Mode =back =head2 Standalone mod_perl Enabled Apache Server The first approach is to implement a straightforward mod_perl server. Just take your plain Apache server and add mod_perl, like you add any other Apache module. 
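For example, the simplest setup of this kind might run your existing
CGI scripts under C<Apache::Registry> with just a few configuration
lines (the paths here are illustrative):

  # httpd.conf -- illustrative paths
  Alias /perl/ /home/httpd/perl/
  <Location /perl>
      SetHandler  perl-script
      PerlHandler Apache::Registry
      Options     +ExecCGI
      PerlSendHeader On
  </Location>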
You continue to run it at the port it was using before. You probably want to try this before you proceed to more sophisticated and complex techniques. The advantages: =over 4 =item * Simplicity. You just follow the installation instructions, configure it, restart the server and you are done. =item * No network changes. You do not have to worry about using additional ports as we will see later. =item * Speed. You get a very fast server and you see an enormous speedup from the first moment you start to use it. =back The disadvantages: =over 4 =item * The process size of a mod_perl-enabled Apache server is huge (maybe 4MB at startup and growing to 10MB and more, depending on how you use it) compared to typical plain Apache. Of course if memory sharing is in place, RAM requirements will be smaller. You probably have a few tens of child processes. The additional memory requirements add up in direct relation to the number of child processes. Your memory demands are growing by an order of magnitude, but this is the price you pay for the additional performance boost of mod_perl. With memory prices so cheap nowadays, the additional cost is low -- especially when you consider the dramatic performance boost mod_perl gives to your services with every 100MB of RAM you add. While you will be happy to have these monster processes serving your scripts with monster speed, you should be very worried about having them serve static objects such as images and html files. Each static request served by a mod_perl-enabled server means another large process running, competing for system resources such as memory and CPU cycles. The real overhead depends on the static object request rate. Remember that if your mod_perl code produces HTML code which includes images, each one will turn into another static object request. Having another plain webserver to serve the static objects solves this unpleasant obstacle. Having a proxy server as a front end, caching the static objects and freeing the mod_perl processes from this burden is another solution. We will discuss both below. =item * Another drawback of this approach is that when serving output to a client with a slow connection, the huge mod_perl-enabled server process (with all of its system resources) will be tied up until the response is completely written to the client. While it might take a few milliseconds for your script to complete the request, there is a chance it will be still busy for some number of seconds or even minutes if the request is from a slow connection client. As in the previous drawback, a proxy solution can solve this problem. More on proxies later. Proxying dynamic content is not going to help much if all the clients are on a fast local net (for example, if you are administering an Intranet.) On the contrary, it can decrease performance. Still, remember that some of your Intranet users might work from home through slow modem links. =back If you are new to mod_perl, this is probably the best way to get yourself started. And of course, if your site is serving only mod_perl scripts (close to zero static objects, like images), this might be the perfect choice for you! For implementation notes, see the "L<One Plain and One mod_perl enabled Apache Servers|scenario/One_Plain_and_One_mod_perl_enabl>" section in the implementations chapter. 
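One practical note on the process-size concern raised above: the CPAN
module C<Apache::SizeLimit> can kill off children that grow beyond a
given size. A minimal sketch (the limit value is an arbitrary
example):

  # in startup.pl; sizes are in KB
  use Apache::SizeLimit ();
  $Apache::SizeLimit::MAX_PROCESS_SIZE = 10000;

and in I<httpd.conf>:

  PerlFixupHandler Apache::SizeLimit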
=head2 One Plain Apache and One mod_perl-enabled Apache Servers As I have mentioned before, when running scripts under mod_perl you will notice that the httpd processes consume a huge amount of virtual memory -- from 5MB to 15MB and even more. That is the price you pay for the enormous speed improvements under mod_perl. (Again -- shared memory keeps the real memory that is being used much smaller :) Using these large processes to serve static objects like images and html documents is overkill. A better approach is to run two servers: a very light, plain Apache server to serve static objects and a heavier mod_perl-enabled Apache server to serve requests for dynamic (generated) objects (aka CGI). From here on, I will refer to these two servers as B<httpd_docs> (vanilla Apache) and B<httpd_perl> (mod_perl enabled Apache). The advantages: =over 4 =item * The heavy mod_perl processes serve only dynamic requests, which allows the deployment of fewer of these large servers. =item * C<MaxClients>, C<MaxRequestsPerChild> and related parameters can now be optimally tuned for both C<httpd_docs> and C<httpd_perl> servers, something we could not do before. This allows us to fine tune the memory usage and get better server performance. Now we can run many lightweight C<httpd_docs> servers and just a few heavy C<httpd_perl> servers. =back An B<important> note: When a user browses static pages and the base URL in the B<Location> window points to the static server, for example C<http://www.example.com/index.html> -- all relative URLs (e.g. C<E<lt>A HREF="/main/download.html"E<gt>>) are being served by the light plain Apache server. But this is not the case with dynamically generated pages. For example when the base URL in the B<Location> window points to the dynamic server -- (e.g. C<http://www.example.com:8080/perl/index.pl>) all relative URLs in the dynamically generated HTML will be served by the heavy mod_perl processes. You must use fully qualified URLs and not relative ones! C<http://www.example.com/icons/arrow.gif> is a full URL, while C</icons/arrow.gif> is a relative one. Using C<E<lt>BASE HREF="http://www.example.com/"E<gt>> in the generated HTML is another way to handle this problem. Also, the C<httpd_perl> server could rewrite the requests back to C<httpd_docs> (much slower) and you still need the attention of the heavy servers. This is not an issue if you hide the internal port implementations, so the client sees only one server running on port C<80>. (See L<Publishing Port Numbers other than 80|config/Publishing_Port_Numbers_other_th>) The disadvantages: =over 4 =item * An administration overhead. =over 4 =item * The need for two configuration files. =item * The need for two sets of controlling scripts (startup/shutdown) and watchdogs. =item * If you are processing log files, now you probably will have to merge the two separate log files into one before processing them. =back =item * Just as in the one server approach, we still have the problem of a mod_perl process spending its precious time serving slow clients, when the processing portion of the request was completed a long time ago. Deploying a proxy solves this, and will be covered in the next section. As with the single server approach, this is not a major disadvantage if you are on a fast network (i.e. Intranet). It is likely that you do not want a buffering server in this case. 
=back

Before you go on with this solution you really want to look at the
L<Adding a Proxy Server in http Accelerator
Mode|strategy/Adding_a_Proxy_Server_in_http_Ac> section.

For implementation notes see the "L<One Plain and One mod_perl enabled
Apache Servers|scenario/One_Plain_and_One_mod_perl_enabl>" section in
the implementations chapter.

=head2 One light non-Apache and One mod_perl enabled Apache Servers

If the only requirement from the light server is for it to serve
static objects, then you can get away with non-Apache servers that
have an even smaller memory footprint. C<thttpd> has been reported to
be about 5 times faster than Apache (especially under a heavy load),
since it is very simple, uses almost no memory (260K) and does not
spawn child processes.

The Advantages:

=over 4

=item *

All the advantages of the 2 servers scenario.

=item *

More memory saving. Apache is about 4 times bigger than B<thttpd>; if
you spawn 30 children you use about 30M of memory, while B<thttpd>
uses only 260K -- 100 times less! You could use the 30M you've saved
to run a few more mod_perl servers.

The memory savings are significantly smaller if your OS supports
memory sharing with Dynamically Shared Objects (DSO) and you have
configured Apache to use it. If you do allow memory sharing, 30 light
Apache servers ought to use only about 3 to 4MB, because most of it
will be shared. There is no memory sharing if Apache modules are
statically compiled into the httpd executable.

=item *

Reported to be about 5 times faster than plain Apache serving static
objects.

=back

The Disadvantages:

=over 4

=item *

Lacks some of Apache's features, like access control, error
redirection, customizable log file formats, and so on.

=back

Another interesting choice is the kHTTPd webserver for Linux. kHTTPd
is different from other webservers in that it runs from within the
Linux kernel as a module (device-driver). kHTTPd handles only static
(file based) web-pages, and passes all requests for non-static
information to a regular userspace webserver such as Apache. For more
information see http://www.fenrus.demon.nl/.

Also check out the Boa webserver: http://www.boa.org/

=head1 Adding a Proxy Server in http Accelerator Mode

At the beginning there were two servers: one plain Apache server,
which was I<very light>, and configured to serve static objects, the
other mod_perl enabled (I<very heavy>) and configured to serve
mod_perl scripts and handlers. As you remember we named them
C<httpd_docs> and C<httpd_perl> respectively.

In the dual-server setup presented earlier the two servers coexist at
the same IP address by listening to different ports: C<httpd_docs>
listens to port 80 (e.g. http://www.example.com/images/test.gif) and
C<httpd_perl> listens to port 8080
(e.g. http://www.example.com:8080/perl/test.pl). Note that we did not
write http://www.example.com:80 for the first example, since port 80
is the default port for the http service. Later on, we will be
changing the configuration of the C<httpd_docs> server to make it
listen to port 81.

This section will attempt to convince you that you really B<want> to
deploy a proxy server in the http accelerator mode. This is a special
mode that, in addition to providing the normal caching mechanism,
accelerates your CGI and mod_perl scripts.

The advantages of using the proxy server in conjunction with mod_perl
are:

=over 8

=item *

The first advantage is the usual benefit of a proxy server: it serves
static objects from the proxy's cache.
You get less I/O activity reading static objects from the disk (the
proxy serves the most "popular" objects from RAM -- of course you
benefit more if you allow the proxy server to consume more RAM). Since
you do not wait for the I/O to be completed, you are able to serve
static objects much faster.

=item *

And the extra functionality provided by the http-acceleration mode,
which makes the proxy server act as a sort of output buffer for the
dynamic content. The mod_perl server sends the entire response to the
proxy and is then free to deal with other requests. The proxy server
is responsible for sending the response to the browser. So if the
transfer is over a slow link, the mod_perl server is not waiting
around for the data to move.

Using numbers is always more convincing than just words. So we are
going to show a simple example from the real world.

First let's explain the abbreviations used in the networking world. If
someone claims to have a 56 kbps connection -- it means that the
connection is of 56 kilo-bits per second (~56000 bits/sec). It's not
56 kilo-bytes per second, but 7 kilo-bytes per second, because 1 byte
equals 8 bits. So don't let the merchants fool you -- your modem gives
you a 7 kilo-byte per second connection at most, and not 56 kilo-bytes
per second as one might think.

Another convention used in computer literature is that 10Kb usually
means 10 kilo-bits and 10KB means 10 kilo-bytes. So an upper case
C<B> generally refers to bytes, and a lower case C<b> to bits (and
C<K> of course means kilo and equals 1024 or 1000, depending on the
field it's used in). Remember that the latter convention is not
followed everywhere, so use this knowledge with care. This document
follows this convention.

So here is the real world example. Let's look at the typical scenario
with a user connected to your site with a 56Kbps modem. It means that
the speed of the user's link is 56/8 = 7KBytes per sec. Let's assume
an average generated HTML page to be 42KB in size and an average
mod_perl script to generate this response in 0.5 seconds. How many
responses could this script produce during the time it takes for the
output to be delivered to the user? A simple calculation reveals
pretty scary numbers:

  42KB / (0.5s * 7KB/s) = 12

12 other dynamic requests could be served at the same time, if we
could put mod_perl to do only what it's best at: generating responses.

This very simple example shows us that we need only one twelfth the
number of children running, which means that we will need only one
twelfth of the memory (not quite true because some parts of the code
are shared).

But you know that nowadays scripts often return pages which are blown
up with JavaScript code and similar, which can make them 100KB in
size, and the download time will be of the order of... (This
calculation is left to you as an exercise :)

Moreover many users like to open many browser windows and do many
things at once (download files and browse graphically I<heavy>
sites). So the 7KB/sec speed we assumed before may in reality be 5-10
times slower.

=item *

This technique allows us to hide the details of the server's
implementation. Users will never see ports in the URLs (more on that
topic later). You can have a few boxes serving the requests, and only
one serving as a front end, which spreads the jobs between the servers
in a way that you can control. You can actually shut down a server,
without the user even noticing, because the front end server will
dispatch the jobs to other servers.
(This is called I<Load Balancing>. It's a pretty big issue, which
would take a book of its own to cover, and therefore it will not be
discussed here. There is plenty of information available on the
Internet, though. For more information see L<'High-Availability Linux
Project'|download/High_Availability_Linux_Project>.)

=item *

For security reasons, using any httpd accelerator (or a proxy in httpd
accelerator mode) is essential because you do not let your internal
server get directly attacked by arbitrary packets from whomever. The
httpd accelerator and internal server communicate using normal HTTP
requests. This allows for only your public "bastion" accelerating www
server to get hosed in a successful attack, while leaving your
internal data safe.

This is true if your back-end server listens only on localhost
(127.0.0.1), which makes it impossible to connect to it from the
outside. But you don't need to connect from the outside anymore. You
will see why when you proceed to this technique's implementation
notes.

=back

The disadvantages are:

=over 8

=item *

Of course there are drawbacks. Luckily, these are not functionality
drawbacks; they are more of an administration hassle. You have another
daemon to worry about, and while proxies are generally stable, you
have to make sure to prepare proper startup and shutdown scripts,
which are run at boot and reboot as appropriate. This is something
that you do once and never come back to again.

Also, you might want to set up the crontab to run a watchdog script
that will make sure that the proxying server is running and restart it
if it detects a problem, reporting the problem to the administrator on
the way.

=item *

Proxy servers can be configured to be light or heavy. The
administrator must decide what gives the highest performance for his
application. A proxy server like I<squid> is light in the sense of
having only one process serving all requests, but it can consume a lot
of memory when it loads objects into memory for faster service.

=item *

If you use the default logging mechanism for all requests on the
front- and back-end servers, the requests that are forwarded to the
back-end server will be logged twice, which makes it tricky to merge
the two logfiles, should you want to.

One solution is to tell the heavy Apache not to bother logging
requests that seem to come from the light Apache's machine. You might
do this by installing a custom C<PerlLogHandler> or just piping to
I<access_log> via C<grep -v> (match all but this pattern) for the
requests coming from the light Apache server. In this scenario, the
I<access_log> written by the light Apache is the file you should work
with. But you need to look for any direct accesses to the heavy server
in case the proxy server is sometimes bypassed, which can be
eliminated if the server is listening only to the localhost
(127.0.0.1).

If you still decide to log proxied requests at the back-end server,
they will be useless, since instead of the real remote IP of the user
you will always get the same IP, that of the front-end server. Later
in this Chapter on page XXX (mod_proxy_add_forward) we present a
solution for this problem.

=back

Have I succeeded in convincing you that you want a proxy server?

Of course if you are on a very fast local area network (LAN) (which
means that all your users are connected from this LAN and not from the
outside), then the big benefit of the proxy buffering the output and
feeding a slow client is gone.
You are probably better off sticking with a straight mod_perl server in this case. =head1 Implementations of Proxy Servers As of this writing, two proxy implementations are known to be widely used with mod_perl, the B<squid> proxy server and B<mod_proxy> which is a part of the Apache server. Let's compare them. =head2 The Squid Server The Advantages: =over 4 =item * Caching of static objects. These are served much faster, assuming that your cache size is big enough to keep the most frequently requested objects in the cache. =item * Buffering of dynamic content, by taking the burden of returning the content generated by mod_perl servers to slow clients, thus freeing mod_perl servers from waiting for the slow clients to download the data. Freed servers immediately switch to serve other requests, thus your number of required servers goes down dramatically. =item * Non-linear URL space / server setup. You can use Squid to play some tricks with the URL space and/or domain based virtual server support. =back The Disadvantages: =over 4 =item * Proxying dynamic content is not going to help much if all the clients are on a fast local net. Also, by default squid only buffers in 16KB chunks so it would not allow mod_perl to complete immediately if the output is larger. (C<READ_AHEAD_GAP> which is 16KB by default, can be enlarged in defines.h if your OS allows that). =item * Speed. Squid is not very fast today when compared with the plain file based web servers available. Only if you are using a lot of dynamic features such as mod_perl or similar is there a reason to use Squid, and then only if the application and the server are designed with caching in mind. =item * Memory usage. Squid uses quite a bit of memory. META: more details? =item * HTTP protocol level. Squid is pretty much a C<HTTP/1.0> server, which seriously limits the deployment of C<HTTP/1.1> features. =item * HTTP headers, dates and freshness. The squid server might give out stale pages, confusing downstream/client caches. (You update some documents on the site, but squid will still serve the old ones.) =item * Stability. Compared to plain web servers, Squid is not the most stable. =back The pros and cons presented above lead to the idea that you might want to use squid for its dynamic content buffering features, but only if your server serves mostly dynamic requests. So in this situation, when performance is the goal, it is better to have a plain Apache server serving static objects, and squid proxying the mod_perl enabled server only. For implementation details, see the sections L<Running One Webserver and Squid in httpd Accelerator Mode|scenario/Running_One_Webserver_and_Squid_> and L<Running Two Webservers and Squid in httpd Accelerator Mode|scenario/Running_Two_webservers_and_Squid> in the implementations chapter. =head2 Apache's mod_proxy I do not think the difference in speed between Apache's B<mod_proxy> and B<squid> is relevant for most sites, since the real value of what they do is buffering for slow client connections. However, squid runs as a single process and probably consumes fewer system resources. The trade-off is that mod_rewrite is easy to use if you want to spread parts of the site across different back end servers, while mod_proxy knows how to fix up redirects containing the back-end server's idea of the location. 
With squid you can run a redirector process to proxy to more than one back end, but there is a problem in fixing redirects in a way that keeps the client's view of both server names and port numbers in all cases. The difficult case is where you have DNS aliases that map to the same IP address and you want the redirect to port 80 and the server is on a different port and you want to keep the specific name the browser has already sent, so that it does not change in the client's B<Location> window. The Advantages: =over 4 =item * No additional server is needed. We keep the one plain plus one mod_perl enabled Apache servers. All you need is to enable C<mod_proxy> in the C<httpd_docs> server and add a few lines to C<httpd.conf> file. =item * The C<ProxyPass> and C<ProxyPassReverse> directives allow you to hide the internal redirects, so if C<http://example.com/modperl/> is actually C<http://localhost:81/modperl/>, it will be absolutely transparent to the user. C<ProxyPass> redirects the request to the mod_perl server, and when it gets the response, C<ProxyPassReverse> rewrites the URL back to the original one, e.g: ProxyPass /modperl/ http://localhost:81/modperl/ ProxyPassReverse /modperl/ http://localhost:81/modperl/ =item * It does mod_perl output buffering like squid does. See the L<Using mod_proxy|scenario/mod_proxy> notes for more details. =item * It even does caching. You have to produce correct C<Content-Length>, C<Last-Modified> and C<Expires> http headers for it to work. If some of your dynamic content does not change frequently, you can dramatically increase performance by caching it with mod_proxy. =item * C<ProxyPass> happens before the authentication phase, so you do not have to worry about authenticating twice. =item * Apache is able to accelerate secure HTTP requests completely, while also doing accelerated HTTP. With Squid you have to use an external redirection program for that. =item * The latest (Apache 1.3.6 and later) Apache proxy accelerated module is reported to be very stable. =back For implementation see the "L<Using mod_proxy|scenario/mod_proxy>" section in the implementation chapter. =head2 Closing Lingering Connections with Lingerd Because of some technical complications in TCP/IP, at the end of each client connection, it is not enough for Apache to close the socket and forget about it; instead, it needs to spend about one second I<lingering> on the client. (More details can be found at http://httpd.apache.org/docs/misc/fin_wait_2.html) Lingerd is a daemon (service) designed to take over the job of properly closing network connections from an http server like Apache and immediately freeing it to handle a new connection. C<lingerd> can only do an effective job if HTTP C<Keep-Alive>s are turned off; since C<Keep-Alive>s are useful for images, the recommended setup is to have C<lingerd> serving mod_perl enabled Apache and plain Apache for images and other static objects. With a C<lingerd> setup, you don't have the proxy, so the buffering chain we have presented before for the proxy setup is much shorter here: FIGURE: | Apache Kernel |TCP/IP `o' | [mod_perl]=>[sendbuf] |======> /|\ | | / \ Hence in this setup it becomes more important to have a big enough kernel send buffer. With lingerd, a big enough kernel send buffer, and keep-alives off, the job of spoonfeeding the data to a slow client is done by the OS kernel in the background. As a result, C<lingerd> makes it possible to serve the same load using considerably fewer Apache processes. 
This translates into a reduced load on the server. It can be used as
an alternative to the proxy setups we have seen so far.

For more information about C<lingerd> see:
http://www.iagora.com/about/software/lingerd/

=head1 When One Machine is not Enough for RDBMS Database and mod_perl

Imagine a scenario where you start your business as a small service
providing web-site. After a while your business becomes very popular,
and at some point you understand that it has outgrown the capacity of
your machine. Therefore you decide to upgrade your current machine
with lots of memory, a cutting edge super expensive CPU and an
ultra-fast hard disk. As a result the load goes back to normal, but
not for long: the demand for your services keeps on growing, and only
a little while after you've upgraded your machine it once again cannot
cope with the load. Should you buy an even stronger and very expensive
machine, or start looking for another solution? Let's explore a
possible solution for this problem.

A typical web service consists of two main software components, the
database server and the web server.

A typical user-server interaction consists of accepting the query
parameters entered into an HTML form and submitted to the web server
by a user, converting these parameters into a database query, sending
it to the database server, accepting the results of the executed
query, formatting them into a nice HTML page, and sending it to a
user's Internet browser or another application that created the
request (e.g. WAP).

This figure depicts the above description:

               1                       2
  [        ] ====> [               ] ====> [                 ]
  [ Client ]       [ Apache Server ]       [ Database Server ]
  [        ] <==== [               ] <==== [                 ]
               4                       3

This schema is known as a 3-tier architecture in the computing world.
A 3-tier architecture means splitting up several processes of your
computing solution between different machines.

=over

=item * Tier 1

The client, who will see the data on its screen and can give
instructions to modify or process the data. In our case, an Internet
browser.

=item * Tier 2

The application server, which does the actual processing of the data
and sends it back to the client. In our case, a mod_perl enabled
Apache server.

=item * Tier 3

The database server, which stores and retrieves all the data for the
application server.

=back

We are interested only in the second and the third tiers; we don't
specify user machine requirements, since mod_perl is all about server
side programming. The only thing the client should be able to do is to
render the generated HTML from the response, which any simple browser
will do. Of course I'm not talking about the case where you return
some heavy Java applets, but that movie is screened in another
theater.

=head2 Servers' Requirements

Let's first understand what kind of software the web and database
servers are, what they need to run fast and what implications they
have on the rest of the system software. The three important machine
components are the hard disk, the amount of RAM and the CPU type.

Typically the mod_perl server is mostly RAM hungry, while the SQL
database server mostly needs a very fast hard-disk. Of course if your
mod_perl process reads a lot from the disk (quite an infrequent
phenomenon) you will need a fast disk too. And if your database server
has to do a lot of sorting of big tables and lots of big table joins,
it will need a lot of RAM too.
If we were to specify average "virtual" requirements for each machine,
this is what we'd get:

An I<"ideal"> mod_perl machine:

  * HD:  low-end (no real IO, mostly logging)
  * RAM: the more the better
  * CPU: medium to high (according to needs)

An I<"ideal"> database server machine:

  * HD:  high-end
  * RAM: large amounts (big joins, sorting of many records)
         small amounts (otherwise)
  * CPU: medium to high (according to needs)

=head2 The Problem

With the database and the httpd on the same machine, you have
conflicting interests.

During peak loads, Apache will spawn more processes and use RAM that
the database server might have been using, or that the kernel was
using on its behalf in the form of a cache. You will starve your
database of resources at the time when it needs those resources the
most.

Disk I/O contention is the biggest time issue. Adding another disk
wouldn't cut I/O times, because the database is the only thing doing
real I/O -- the mod_perl processes have all their code loaded in
memory (I'm talking about code that does pure Perl and SQL
processing). So it's clear that the DB is I/O and CPU bound (RAM bound
only if there are big joins to make), while mod_perl is CPU and mostly
RAM bound.

The problem exists, but it doesn't mean that you cannot run the
application and the web servers on the same machine. There is a very
high degree of parallelism in modern PC architecture. The I/O hardware
is helpful here. The machine can do many things while a SCSI subsystem
is processing a command, or the network hardware is writing a buffer
over the wire.

If a process is not runnable (that is, it is blocked waiting for I/O
or similar), it is not using significant CPU time. The only CPU time
that will be required to maintain a blocked process is the time it
takes for the operating system's scheduler to look at the process,
decide that it is still not runnable, and move on to the next process
in the list. This is hardly any time at all. If there are two
processes and one of them is blocked on I/O and the other is CPU
bound, the blocked process is getting 0% CPU time, the runnable
process is getting 99.9% CPU time, and the kernel scheduler is using
the remainder.

=head2 The Solution

The solution is to add another machine, which allows a set-up where
both the database and the web server run on their own dedicated
machines.

=head3 Pros

=over

=item * Hardware Requirements

This allows you to scale the two requirements independently. If your
httpd processes are heavily weighted with respect to RAM consumption,
you can easily add another machine to accommodate more httpd
processes, without changing your database machine. If your database is
CPU intensive, but your httpd doesn't need much CPU time, you can get
low end machines for the httpd and a high end machine with a very fast
CPU for the database server.

=item * Scalability

Since your web server no longer depends on the database server's
location, you can add more web servers hitting the same database
server, using the existing infrastructure.

=item * Database Security

Once you have multiple web server boxes, the backend database becomes
a single point of failure, so it's a good idea to shield it from
direct Internet access, something you couldn't do when you had both
servers on the same machine.

=back

=head3 Cons

=over

=item * Network latency

A database request from a webserver to a database server running on
the same machine uses UNIX sockets, compared to the TCP/IP sockets
used when the client submits the query from another machine.
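With MySQL and C<DBI>, for example, the choice between the two is just
a matter of the DSN; the credentials below are of course made up:

  use DBI ();

  # same machine: the client library connects via the UNIX socket
  my $dbh_local = DBI->connect("DBI:mysql:database=test;host=localhost",
                               "user", "password");

  # dedicated database machine: the queries travel over TCP/IP
  my $dbh_tcp   = DBI->connect("DBI:mysql:database=test;host=db.example.com",
                               "user", "password");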
UNIX sockets are very fast, since all the communication happens within
the same box, eliminating network delays. TCP/IP socket communication
depends entirely on the quality and the speed of the network
connecting the two machines.

Basically, you can have almost the same client-server speed if you
install a very fast and dedicated network between the two machines. It
might impose the cost of additional NICs, but that is probably
insignificant compared to the speed-up you gain.

But even the normal network that you already have would probably fit
as well, because the network delays are probably much smaller than the
time it takes to execute the query. In contrast to the previous
paragraph, you really want to test the added overhead, since the
network can be quite slow, especially at peak hours.

How do you know what overhead is a significant one? All you have to
measure is the average time spent in the web server and the database
server. If either of the two numbers is at least 20 times bigger than
the added overhead of the network, you are all set.

To give you some numbers: if your query takes about 20 milliseconds to
process and only 1 millisecond to deliver the results, it's good. If
the delivery takes about half of the time the processing takes, you
should start considering switching to a faster and/or dedicated
network.

The consequences of a slow network can be quite bad. If the network is
slow, mod_perl processes remain open waiting for data from the
database server and eat even more RAM as new child processes pop up to
handle new requests. So the overall machine performance can be worse
than it was originally, when you had just a single machine for both
servers.

=back

=head2 Three Machines Model

Since we are talking about using a dedicated machine for each server,
you might consider adding a third machine to do the proxy work; this
will make your setup even more flexible, since it will enable you to
proxy-pass all requests not just to one mod_perl running box, but to
many of them. It will enable you to do load balancing if and when you
need it.

Generally the proxy machine can be very light when it serves just a
little traffic and mainly proxy-passes to the mod_perl processes. Of
course you can use this machine to serve the static content, and then
the hardware requirement will depend on the number of objects you will
have to serve and the rate at which they are requested.

=head1 Running More than One mod_perl Server on the Same Machine.

Let's assume that you have two different sets of code which have
little or nothing in common -- different Perl modules, no code
sharing. Typical numbers can be four megabytes of unshared and four
megabytes of shared memory for each code set, plus three megabytes of
shared basic mod_perl stuff, which makes each process 19MB in size
when the two code sets are loaded:

    3MB (server core shared)
  + 4MB (shared 1st code set)   + 4MB (unshared 1st code set)
  + 4MB (shared 2nd code set)   + 4MB (unshared 2nd code set)

Under this scenario:

  Shared_RAM_per_Child :  11MB
  Max_Process_Size     :  19MB
  Total_RAM            : 251MB

We assume that four megabytes is the size of each code set's unshared
memory. This is a pretty typical size of unshared memory, especially
when connecting to databases, as the database connections cannot be
shared. Databases like Oracle can take even more RAM per connection on
top of this.

Let's assume that we have 251 megabytes of RAM dedicated to the
webserver.
=head2 Three Machines Model

Since we are talking about a dedicated machine for each server, you
might consider adding a third machine to do the proxy work; this will
make your setup even more flexible, since it will enable you to
proxy-pass all requests not just to one mod_perl box, but to many of
them. It will enable you to do load balancing if and when you need it.

Generally the proxy machine can be very light when it serves just a
little traffic and mainly proxy-passes requests to the mod_perl
processes. Of course you can use this machine to serve the static
content as well, in which case the hardware requirements will depend
on the number of objects you have to serve and the rate at which they
are requested.

=head1 Running More than One mod_perl Server on the Same Machine

Let's assume that you have two different sets of code which have
little or nothing in common--different Perl modules, no code sharing.
Typical numbers can be four megabytes of unshared and four megabytes
of shared memory for each code set, plus three megabytes of shared
basic mod_perl stuff, which makes each process 19MB in size when the
two code sets are loaded: 3MB (server core shared) + 4MB (shared 1st
code set) + 4MB (unshared 1st code set) + 4MB (shared 2nd code set) +
4MB (unshared 2nd code set). Under this scenario:

  Shared_RAM_per_Child :  11MB
  Max_Process_Size     :  19MB
  Total_RAM            : 251MB

We assume that four megabytes is the size of each code set's unshared
memory. This is a pretty typical size of unshared memory, especially
when connecting to databases, as the database connections cannot be
shared. Databases like Oracle can take even more RAM per connection on
top of this.

Let's assume that we have 251 megabytes of RAM dedicated to the web
server.

According to the equation developed in the section
"L<Choosing MaxClients|performance/Choosing_MaxClients>":

                 Total_RAM - Shared_RAM_per_Child
  MaxClients = ---------------------------------------
               Max_Process_Size - Shared_RAM_per_Child

  MaxClients = (251 - 11)/(19 - 11) = 30

We see that we can run 30 processes, using the given memory and the
two code sets in the same server.

Now consider this practical decision. Since we have recognized that
the code sets are very distinct in nature and there is no significant
memory sharing in place, the wise thing to do is to split the two code
sets between two mod_perl servers (a single mod_perl server is
actually a set of the parent process and a number of child
processes). So instead of running everything on one server, we now
move the second code set onto another mod_perl server. At this point
we are still talking about a single machine.

Let's look at the figures again. After the split we will have 15
servers of eleven megabytes (4MB unshared + 7MB shared) and another 15
of the same kind. How much memory do we need now? From the above
equation we derive:

  Total_RAM = MaxClients * (Max_Process_Size - Shared_RAM_per_Child)
              + Shared_RAM_per_Child

And using the numbers (for the total of 30 servers):

  Total_RAM = 2 * (15*(11-7)+7) = 134

A total of 134MB of memory is required. But hey, we have 251MB of
memory: we've got 117MB of memory freed up. If we recalculate
C<MaxClients>, we will see that we can run almost 60 servers:

  MaxClients = (251 - 7*2)/(11-7) = 59

So we can run about 29 more servers using the same memory size--almost
30 servers for each code set instead of the 15 we started with. We
have nearly doubled the server pool without changing the machine's
hardware.

Moreover, this new setup allows us to fine-tune the two code sets,
since in reality the code base that is smaller in size might have a
higher hit rate, so we can benefit even more.

Let's assume that, based on the usage statistics, we know that the
first code set is called in 70% of requests and the second set serves
the other 30%. Now assume that the first code set requires only 5MB of
RAM (3MB shared plus 2MB unshared) over the basic mod_perl server
size, and the second set needs 11MB (7MB shared and 4MB unshared).

Let's compare this new requirement with our original 50:50 setup
(where we assigned the same number of clients to each code set). Now
the first mod_perl server, running the first code set, will have all
its processes using 8MB (3MB (server shared) + 3MB (code shared) + 2MB
(code unshared)), and the second 14MB (3+7+4). Given that we have a
70:30 hit ratio and 251MB of available memory, we have to solve these
two equations:

  X/Y = 7/3

  X*(8-6) + 6 + Y*(14-10) + 10 = 251

where X is the total number of processes the first code set can use
and Y the second. The first equation reflects the 70:30 hit ratio, and
the second uses the equation for the total memory requirement for the
given number of servers and the shared and unshared memory sizes.

When we solve these equations, we find that X equals 63 and Y equals
27. So we have a total of 90 servers--three times the number of
servers we started with, using the same memory size.

The hit-rate optimized solution, and the fact that the code sets can
differ in their memory requirements, allowed us to run about 30 more
servers in total, and gave us 33 more servers (63 versus 30) for the
most wanted code base, relative to the simple 50:50 split of the
previous example.
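If you want to redo this arithmetic with your own measurements, here
is a minimal sketch of the calculation; the memory sizes and the 70:30
weights are just the assumptions used above, and the fractional
servers are simply truncated.

  use strict;

  my $total_ram = 251;    # MB
  # per-code-set figures from the example above (MB)
  my %set = (
      first  => { size => 8,  shared => 6,  weight => 7 },  # 70% of hits
      second => { size => 14, shared => 10, weight => 3 },  # 30% of hits
  );

  # solve X/Y = w1/w2 and
  # X*(size1-shared1) + shared1 + Y*(size2-shared2) + shared2 = Total_RAM
  my $unshared1 = $set{first}{size}  - $set{first}{shared};
  my $unshared2 = $set{second}{size} - $set{second}{shared};
  my $fixed     = $set{first}{shared} + $set{second}{shared};
  my $ratio     = $set{first}{weight} / $set{second}{weight};

  # substitute X = ratio * Y and solve for Y
  my $y = ($total_ram - $fixed) / ($ratio * $unshared1 + $unshared2);
  my $x = $ratio * $y;

  printf "first code set: %d servers, second: %d servers\n", $x, $y;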
Of course, if you identify more than two distinct sets of code based
on your hit rate statistics, more complicated solutions may be
required. You could make even more splits and run three or more
mod_perl servers.

Remember that having too many running processes doesn't necessarily
mean better performance, because all of them will contend for CPU time
slices. The more processes that are running, the less CPU time each
gets and the slower the overall performance will be. Therefore, after
reaching a certain load you might want to start spreading the servers
over different machines.

In addition to the obvious memory savings, you gain the ability to
troubleshoot problems more easily when you have different components
running on different servers. It's quite possible that a small change
in the server configuration, made to fix or improve something for one
code set, might completely break the second code set. For example,
suppose you upgrade the first code set and it requires an update of
some modules that both code bases rely on: there is a chance that the
second code set won't work with the new version of a module it was
relying on.

=head1 SSL functionality and a mod_perl Server

If you need SSL functionality, you can get it by adding mod_ssl or the
equivalent Apache-SSL to the light front-end server (httpd_docs) or to
the heavy back-end mod_perl server (httpd_perl). (The configuration
and installation instructions are located
L<here|install/mod_perl_and_mod_ssl___openssl_>.)

The question is: is it a good idea to add mod_ssl to the back-end
mod_perl enabled server?

Given that your internal network is secure, or that both the front-
and back-end servers run on the same machine and you can ensure safe
communication between the processes, there is no need for encrypted
traffic between them. If this is your situation, you don't have to put
mod_ssl into the already too-heavy mod_perl server. You will have the
external traffic encrypted by the front-end server, which will
proxy-pass the unencrypted request and response data internally.
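To make this more concrete, here is a minimal front-end sketch of such
a configuration. It assumes that mod_ssl and mod_proxy are compiled
into the front-end server, that the back-end mod_perl server listens
on 127.0.0.1:8080, and the certificate paths are made up:

  <VirtualHost _default_:443>
      SSLEngine on
      SSLCertificateFile    /usr/local/apache/conf/ssl.crt/server.crt
      SSLCertificateKeyFile /usr/local/apache/conf/ssl.key/server.key

      # SSL is terminated here; plain HTTP goes to the back-end
      ProxyPass        /perl/ http://127.0.0.1:8080/perl/
      ProxyPassReverse /perl/ http://127.0.0.1:8080/perl/
  </VirtualHost>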
Another important point: if you put mod_ssl on the back-end, you have
to tunnel your images back to it (i.e. have the back-end serve the
images), defeating the whole purpose of having the lightweight
front-end server. You cannot serve a secure page which includes
non-secured information. If you fetch an HTML page over SSL and it has
an C<E<lt>IMGE<gt>> tag that fetches the image from the non-secure
server, the image is shown broken. This is true for any other
non-secured objects as well. Of course, if the generated response
doesn't include any embedded objects, like images, then this is not a
problem.

Choosing the front-end machine for the SSL functionality also
simplifies the mod_perl configuration by eliminating VirtualHost
duplication for SSL. mod_perl configuration files can be difficult
enough without the mod_ssl overhead.

Also, assuming that your front-end machines are under-worked anyway,
especially if you run a high-volume web service deploying a cluster of
machines to serve requests, you save some CPU, as it's known that SSL
connections are about 100 times more CPU intensive than non-SSL
connections. Of course, caching session keys, so that you don't have
to set up a new symmetric key for every single connection, improves
the situation. If you use the shared memory session caching mechanism
that mod_ssl supports, the overhead is actually rather small, except
for the initial connection.

But then, on the other hand, why even bother to run a full scale
mod_ssl in front? You might as well just choose a small tunnel/port
forwarding application like Stunnel, or one of the many others
mentioned at http://www.openssl.org/related/apps.html .

Of course, if you do heavy SSL processing, ideally you should offload
it to a dedicated cryptography server. But this advice can be
misleading, given the current state of the crypto hardware: if you use
hardware you get extra speed now, but you're locked into a proprietary
solution; in six months to a year software will have caught up with
whatever hardware you're using, and because software is easier to
adapt, you'll have more freedom to change what software you're using
and more control over things. So the choice is in your hands.

=cut

1.1  modperl-docs/src/docs/1.0/guide/style.css

Index: style.css
===================================================================
body {
  font-family: helvetica, arial, sans-serif;
  color: #000000;
  background-color: #ffffee;
  margin: 1em 4em;
}
h1, h2, h3 { color: #005A9C }
h1, h2, h3, h4, h5, h6 { font-family: helvetica, arial, sans-serif; }
p, div, th, td { font-family: helvetica, arial, sans-serif; }
ul, dl, li, dd { font-family: helvetica, arial, sans-serif; }
big, small     { font-family: helvetica, arial, sans-serif; }
strong, b      { font-family: helvetica, arial, sans-serif; }
em, i          { font-family: helvetica, arial, sans-serif; }
a:link         { font-family: helvetica, arial, sans-serif; }
a:visited      { font-family: helvetica, arial, sans-serif; }
a:active       { font-family: helvetica, arial, sans-serif; }
a:hover        { font-family: helvetica, arial, sans-serif; }
pre  { font-family: courier new, courier, monospace; color: #000000; }
code { font-family: courier new, courier, monospace; }
div.notice {
  align: center;
  padding: .25em;
  border: dashed green;
}
div.ad {
  padding: .25em;
  margin: .25em;
  border: dashed red;
  color: #000000;
}
div.navbar {
  align: center;
  padding: .25em;
  border: thin dashed blue;
  color: blue;
}
div.navbar a { text-decoration: none; }
div.toc a    { text-decoration: none; }
div.search {
  align: center;
  padding: .25em;
  border: thin dashed #eeeeee;
  background-color: #eeeeee;
  color: black;
  width: 70%
}

1.1  modperl-docs/src/docs/1.0/guide/troubleshooting.pod

Index: troubleshooting.pod
===================================================================
=head1 NAME

Warnings and Errors Troubleshooting Index

=head1 General Advice

Perl's warnings mode is immensely helpful in detecting possible
problems. Make sure you always turn on warnings while you are
developing code. See L<The Importance of
Warnings|debug/The_Importance_of_Warnings>.

Enabling C<use diagnostics;> generally helps you to determine the
source of the problem and how to solve it. See L<diagnostics
pragma|debug/diagnostics_pragma> for more information.

=head1 Building and Installation

See L<make Troubleshooting|install/make_Troubleshooting> and L<make
test Troubleshooting|install/make_test_Troubleshooting>.

=head1 Configuration and Startup

This section covers errors reported when you attempt to start the
server.

=head2 SegFaults During Startup

You have to build mod_perl with the same compiler as Perl was built
with. Otherwise you may see segmentation faults and other weird things
happen. If this is not the situation and you still experience
segmentation faults, please refer to the I<SUPPORT> file in the
mod_perl source distribution and post the trace from the core file as
explained there.
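An easy way to find out which compiler your Perl was built with is to
ask Perl itself (the gcc output below is just an example):

  % perl -V:cc
  cc='gcc';

Make sure the same compiler is used when you build Apache and
mod_perl.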
=head2 libexec/libperl.so: open failed: No such file or directory

If when you run the server you get the following error:

  libexec/libperl.so: open failed: No such file or directory

it probably means that Perl was compiled with a shared library.
mod_perl detects this and links the Apache executable to the Perl
shared library (I<libperl.so>).

First of all make sure you have Perl installed on the machine, and
that you have I<libperl.so> in
I<E<lt>perlrootE<gt>/E<lt>versionE<gt>/E<lt>architectureE<gt>/CORE>,
for example in I</usr/local/lib/perl5/5.00503/sun4-solaris/CORE>.

Then make sure that this directory is included in the environment
variable C<LD_LIBRARY_PATH>. Under normal circumstances, Apache should
have the path configured at compile time, but this way you can
override the library path.

=head2 install_driver(Oracle) failed: Can't load '.../DBD/Oracle/Oracle.so' for module DBD::Oracle

  install_driver(Oracle) failed: Can't load
  '/usr/lib/perl5/site_perl/5.005/i386-linux/auto/DBD/Oracle/Oracle.so'
  for module DBD::Oracle: libclntsh.so.8.0: cannot open shared object
  file: No such file or directory at
  /usr/lib/perl5/5.00503/i386-linux/DynaLoader.pm line 169.
  at (eval 27) line 3
  Perhaps a required shared library or dll isn't installed where
  expected at /usr/local/apache/perl/tmp.pl line 11

On BSD style filesystems C<LD_LIBRARY_PATH> is not searched for setuid
programs (such as Apache). This isn't a problem for CGI scripts, since
they don't do a setuid (and are forked off), but Apache does, and
mod_perl lives inside Apache. Therefore the first solution is to load
the library explicitly from the system-wide I<ldconfig> configuration
file:

  # echo $ORACLE_HOME/lib >> /etc/ld.so.conf
  # ldconfig

Another solution is to modify the resulting I<Makefile> (after running
C<perl Makefile.PL>) as follows:

  1. search for the line C<LD_RUN_PATH=>
  2. replace it with C<LD_RUN_PATH=(my_oracle_home)/lib>

(my_oracle_home) is, of course, the home path to your Oracle
installation. In particular, the file I<libclntsh.so.8.0> should exist
in that directory. (If you use CPAN, the build directory for
DBD::Oracle should be in I<~/.cpan/build/DBD-Oracle-1.06/> if you're
logged in as root.)

Then just type C<make install>, and all should go well.

FYI, setting C<LD_RUN_PATH> has the effect of hard-coding the path to
I<(my_oracle_home)/lib> in the resulting I<Oracle.so> file generated
by C<DBD::Oracle>, so that at run-time it doesn't have to go searching
through C<LD_LIBRARY_PATH> or the default directories used by C<ld>.

For more information see the I<ld> manpage and an essay on
C<LD_LIBRARY_PATH>: http://www.visi.com/~barr/ldpath.html

=head2 Invalid command 'PerlHandler'...

  Syntax error on line 393 of /etc/httpd/conf/httpd.conf: Invalid
  command 'PerlHandler', perhaps mis-spelled or defined by a module
  not included in the server configuration  [FAILED]

This can happen when you have a mod_perl enabled Apache compiled with
DSO (generally it's an installed RPM or other binary package) but the
mod_perl module isn't loaded. In this case you have to tell Apache to
load mod_perl by adding the following to your I<httpd.conf>:

  AddModule mod_perl.c

This can also happen when you try to run a non-mod_perl Apache server
using the configuration from a mod_perl server.
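With some DSO setups you may also need a C<LoadModule> line before the
C<AddModule> one; the path below is a common default, but it may
differ on your system:

  LoadModule perl_module libexec/libperl.so
  AddModule mod_perl.c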
=head2 RegistryLoader: Translation of uri [...] to filename failed

  RegistryLoader: Translation of uri [/home/httpd/perl/test.pl] to
  filename failed [tried: /home/httpd/docs/home/httpd/perl/test.pl]

This error shows up when C<Apache::RegistryLoader> fails to translate
the URI into the corresponding filesystem path. Most failures happen
when one passes a file path instead of a URI. (A reminder:
I</home/httpd/perl/test.pl> is a file path, while I</perl/test.pl> is
a URI.) In most cases all you have to do is to pass something that
C<Apache::RegistryLoader> expects to get--the URI--but there are more
complex cases. C<Apache::RegistryLoader>'s manpage shows how to handle
these cases as well (look for the trans() sub).

=head2 "Apache.pm failed to load!"

If your server startup fails with:

  Apache.pm failed to load!

try adding this to I<httpd.conf>:

  PerlModule Apache

=head1 Code Parsing and Compilation

=head2 Value of $x will not stay shared at - line 5

See L<my() Scoped Variable in Nested
Subroutines|perl/my_Scoped_Variable_in_Nested_S>.

=head2 Value of $x may be unavailable at - line 5.

See L<my() Scoped Variable in Nested
Subroutines|perl/my_Scoped_Variable_in_Nested_S>.

=head2 Can't locate loadable object for module XXX

There is no object built for this module. For example, when you see:

  Can't locate loadable object for module Apache::Util in @INC...

make sure to give mod_perl's C<Makefile.PL> the C<PERL_UTIL_API=1>,
C<EVERYTHING=1> or C<DYNAMIC=1> parameter to enable and build all the
components of C<Apache::Util>.

=head2 Can't locate object method "get_handlers"...

  Can't locate object method "get_handlers" via package "Apache"

You need to rebuild your mod_perl with stacked handlers, i.e.
C<PERL_STACKED_HANDLERS=1> or, more simply, C<EVERYTHING=1>.

=head2 Missing right bracket at line ...

Most often you will find that you really do have a syntax error.
However, the other possible reason is that a script running under
C<Apache::Registry> is using the C<__DATA__> or C<__END__> token.
L<Learn why|porting/_END_and_DATA_tokens>.
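For example, this minimal (hypothetical) script works as a plain CGI
script, but triggers the error under C<Apache::Registry>:

  #!/usr/bin/perl
  print "Content-type: text/plain\n\n";
  print "Hello\n";

  __END__
  Plain Perl ignores everything below the token, but Apache::Registry
  wraps the whole file in a subroutine, so the subroutine's closing
  bracket is never seen and compilation fails.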
=head2 Can't load '.../auto/DBI/DBI.so' for module DBI

Check that all your modules are compiled with the same Perl that is
compiled into mod_perl. Perl 5.005 and 5.004 are not binary compatible
by default.

Other known causes of this problem: OS distributions that ship with a
broken binary Perl installation, where the C<perl> program and the
C<libperl.a> library are somehow built with different binary
compatibility flags.

The solution to these problems is to rebuild Perl and any extension
modules from a fresh source tree. Tip for running Perl's Configure
script: use the C<-des> flags to accept defaults and the C<-D> flag to
override certain attributes:

  % ./Configure -des -Dcc=gcc ... && make test && make install

Read Perl's INSTALL document for more details.

Solaris OS specific: "C<Can't load DBI>" or a similar error for the IO
module, or whatever dynamic module mod_perl tries to pull in first.
The solution is to re-configure, re-build and re-install Perl and the
dynamic modules with the following flags when Configure asks for
"C<additional LD flags>":

  -Xlinker --export-dynamic

or

  -Xlinker -E

This problem is only known to be caused by installing gnu ld under
Solaris.

=head1 Runtime

=head2 "exit signal Segmentation fault (11)" with mysql

If you build mod_perl and PHP into the same binary, you might get a
segmentation fault, followed by this error:

  exit signal Segmentation fault (11)

Solution: re-compile PHP without the built-in MySQL support (you can
still connect to MySQL).

=head2 foo ... at /dev/null line 0

Under mod_perl you may receive a warning or an error in the
I<error_log> which specifies I</dev/null> as the source file and line
0 as the line number where the printing of the message was triggered.
This is absolutely normal if the code is executed from within a
handler, because there is no actual file associated with the handler.
Therefore C<$0> is set to I</dev/null>, and that's what you see.

=head2 Preventing mod_perl Processes From Going Wild

See the sections "L<Non-Scheduled Emergency Log
Rotation|control/Non_Scheduled_Emergency_Log_Rota>" and "L<All RAM
Consumed|control/All_RAM_Consumed>".

=head2 Segfaults when using XML::Parser

If some of your processes segfault when using C<XML::Parser>, you
should use:

  --disable-rule=EXPAT

during the Apache configuration step. Starting from mod_perl version
1.23 this option is disabled by default.

=head2 My CGI/Perl Code Gets Returned as Plain Text Instead of Being Executed by the Webserver

See L<My CGI/Perl Code Gets Returned as Plain Text Instead of Being
Executed by the Webserver|config/My_CGI_Perl_Code_Gets_Returned_a>.

=head2 Incorrect line number reporting in error/warn log messages

See L<Use of uninitialized value at (eval 80) line
12.|troubleshooting/Use_of_uninitialized_value_at_e>

=head2 rwrite returned -1

This message happens when the client breaks the connection while your
script is trying to write to the client. With Apache 1.3.x, you should
only see the rwrite messages if C<LogLevel> is set to C<debug>. (There
was a bug that reported this debug message regardless of the value of
the C<LogLevel> directive; it was fixed in mod_perl 1.19_01 (L<CVS
version|download/mod_perl>).)

Generally C<LogLevel> is either C<debug> or C<info>. C<debug> logs
everything; C<info> is the next level, which doesn't include debug
messages. You shouldn't use C<debug> mode on your production server.
At the moment there is no way to prevent users from aborting
connections.

=head2 Can't upgrade that kind of scalar ...

Fixed in mod_perl 1.23.

=head2 caught SIGPIPE in process

  [modperl] caught SIGPIPE in process 1234
  [modperl] process 1234 going to Apache::exit with status...

That's the C<$SIG{PIPE}> handler installed by mod_perl/C<Apache::SIG>,
which is called if a connection times out or if the client presses the
'Stop' button. It gives you an opportunity to do cleanups if the
script was aborted in the middle of its execution (a sketch follows
the next section). See L<Handling the 'User pressed Stop button'
case|debug/Handling_the_User_pressed_Stop_> for more info.

If your mod_perl version is earlier than 1.17 you might also get the
message in the following section...

=head2 Client hit STOP or Netscape bit it!

  Client hit STOP or Netscape bit it!
  Process 2493 going to Apache::exit with status=-2

You may see this message in mod_perl versions earlier than 1.17. See
also L<caught SIGPIPE in
process|troubleshooting/caught_SIGPIPE_in_process>.
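Here is a minimal sketch of registering such a cleanup inside a
handler; the lock-releasing function is hypothetical, standing in for
whatever your code must undo:

  sub handler {
      my $r = shift;
      $r->register_cleanup(sub {
          # runs at the end of the request, even if the client
          # aborted the connection in mid-flight
          release_the_lock();   # hypothetical cleanup action
      });
      # ... normal request processing ...
  }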
=head2 Global symbol "$foo" requires explicit package name

The script below will print a warning like the one above, and moreover
it will print the whole script as part of the warning message:

  #!/usr/bin/perl -w
  use strict;
  print "Content-type: text/html\n\n";
  print "Hello $undefined";

The warning:

  Global symbol "$undefined" requires explicit package name at
  /usr/apps/foo/cgi/tmp.pl line 4.
  eval 'package Apache::ROOT::perl::tmp_2epl; use Apache qw(exit);sub
  handler {
  #line 1 /usr/apps/foo/cgi/tmp.pl
  BEGIN {$^W = 1;}#!/usr/bin/perl -w
  use strict;
  print "Content-type: text/html\\n\\n";
  print "Hello $undefined";
  }
  ;' called at
  /usr/apps/lib/perl5/site_perl/5.005/aix/Apache/Registry.pm line 168
  Apache::Registry::compile('package Apache::ROOT::perl::tmp_2epl;use
  Apache qw(exit);sub han...') called at
  /usr/apps/lib/perl5/site_perl/5.005/aix/Apache/Registry.pm line 121
  Apache::Registry::handler('Apache=SCALAR(0x205026c0)') called at
  /usr/apps/foo/cgi/tmp.pl line 4
  eval {...} called at /usr/apps/foo/cgi/tmp.pl line 4
  [Sun Nov 15 15:15:30 1998] [error] Undefined subroutine
  &Apache::ROOT::perl::tmp_2epl::handler called at
  /usr/apps/lib/perl5/site_perl/5.005/aix/Apache/Registry.pm line 135.
  [Sun Nov 15 15:15:30 1998] [error] Goto undefined subroutine
  &Apache::Constants::SERVER_ERROR at
  /usr/apps/lib/perl5/site_perl/5.005/aix/Apache/Constants.pm line 23.

The error is simple to fix. When you use the C<use strict;> pragma
(and you should...), Perl will insist that all variables are declared
before being used, so the error will not arise.

The bad thing is that sometimes the whole script (possibly thousands
of lines) is printed to the I<error_log> file as code that the server
has tried to C<eval()>uate.

Maybe you have a C<$SIG{__DIE__}> handler installed
(C<Carp::confess()>?). If so, that's what's to be expected. You might
wish to try something more terse, such as:

  local $SIG{__WARN__} = \&Carp::cluck;

The confess method is I<very> verbose and will tell you more than you
might wish to know, including the full source.

=head2 Use of uninitialized value at (eval 80) line 12.

Your code includes some undefined variable that you have used as if it
was already defined and initialized. For example:

  $param = $q->param('test');
  print $param;

vs.

  $param = $q->param('test') || '';
  print $param;

In the second case, C<$param> will always be I<defined>, either with
C<$q-E<gt>param('test')>'s return value or with the default value (the
empty string C<''> in our example).

Also read about L<Finding the Line Which Triggered the Error or
Warning|debug/Finding_the_Line_Which_Triggered>.

=head2 Undefined subroutine &Apache::ROOT::perl::test_2epl::some_function called at ...

See L<Names collisions with Modules and
libs|porting/Name_collisions_with_Modules_and>.

=head2 Callback called exit

I<Callback called exit> is just a generic message produced when some
unrecoverable error occurs inside Perl during C<perl_call_sv()> (which
mod_perl uses to invoke all handler subroutines). Such problems seem
to occur far less with 5.005_03 than with 5.004.

Sometimes you discover that your server is not responding and that its
error_log has filled up the remaining space on the file system. When
you get to see the contents of the error_log, it includes millions of
lines like:

  Callback called exit at -e line 33, <HTML> chunk 1.

Why the looping? Perl can get I<very> confused inside an infinite loop
in your code. It doesn't necessarily mean that your code called
C<exit()>: Perl's malloc went haywire and called C<croak()>, but no
memory was left to properly report the error, so Perl got stuck in a
loop writing that same message to stderr.

Perl 5.005+ is recommended for its improved malloc.c and other
features that improve mod_perl, which are turned on by default.

See also L</Out_of_memory!>

=head2 Out of memory!

If something goes really wrong with your code, Perl may die with an
"Out of memory!" message and/or "Callback called exit".
Common causes of this are never-ending loops, deep recursion, or
calling an undefined subroutine.

Here's one way to catch the problem. See Perl's INSTALL document for
this item:

  =item -DPERL_EMERGENCY_SBRK

  If PERL_EMERGENCY_SBRK is defined, running out of memory need not
  be a fatal error: a memory pool can be allocated by assigning to
  the special variable $^M. See perlvar(1) for more details.

If you compile with that option and add 'C<use Apache::Debug level
=E<gt> 4;>' to your PerlScript, it will allocate the C<$^M> emergency
pool and the C<$SIG{__DIE__}> handler will call C<Carp::confess>,
giving you a stack trace which should reveal where the problem is. See
the C<Apache::Resource> module for ways to control httpd processes.

Note that Perl 5.005 and later have C<PERL_EMERGENCY_SBRK> turned on
by default.

The other trick is to have a startup script initialize
C<Carp::confess>, like so:

  use Carp ();
  eval { Carp::confess("init") };

This way, when the real problem happens, C<Carp::confess> doesn't eat
memory from the emergency pool (C<$^M>).

=head2 server reached MaxClients setting, consider raising the MaxClients setting

See L<Choosing MaxClients|performance/Choosing_MaxClients>.

=head2 syntax error at /dev/null line 1, near "line arguments:"

  syntax error at /dev/null line 1, near "line arguments:"
  Execution of /dev/null aborted due to compilation errors.
  parse: Undefined error: 0

There is a chance that your C</dev/null> device is broken. Try:

  % echo > /dev/null

If that fails, you should try to remove this special file and recreate
it:

  # rm /dev/null
  # mknod /dev/null c 1 3
  # chmod a+rw /dev/null

=head2 Can't call method "register_cleanup" (CGI.pm)

  Can't call method "register_cleanup" on an undefined value at
  /usr/lib/perl5/5.00503/CGI.pm line 263.

This is caused by this code snippet in I<CGI.pm>:

  if ($MOD_PERL) {
      Apache->request->register_cleanup(\&CGI::_reset_globals);
      undef $NPH;
  }

One solution is to add to I<httpd.conf>:

  PerlPostReadRequestHandler 'sub { Apache->request(shift) }'

But even better, switch to C<Apache::Cookie>:

  use Apache;
  use Apache::Cookie;

  sub handler {
      my $r = shift;
      my $cookies = Apache::Cookie->new($r)->parse;
      my %bar = $cookies->{foo}->value;
  }

=head1 Shutdown and Restart

=head2 Evil things might happen when using PerlFreshRestart

Unfortunately, not all Perl modules are robust enough to survive a
reload. For them this is an unusual situation. C<PerlFreshRestart>
does not do much more than:

  while (my($k,$v) = each %INC) {
      delete $INC{$k};
      require $k;
  }

Besides that, it flushes the C<Apache::Registry> cache and empties any
dynamic stacked handlers (e.g. C<PerlChildInitHandler>).

Lots of segfaults and other problems have been reported by users who
turned C<PerlFreshRestart> B<On>; most of them went away when it was
turned off. That doesn't mean you shouldn't use it, if it works for
you. Just beware of the dragons...

Note that if you have a mod_perl enabled Apache built as DSO and you
restart it, the whole Perl interpreter is completely torn down
(perl_destruct()) and restarted. The value of C<PerlFreshRestart> is
irrelevant at this point.

=head2 Constant subroutine XXX redefined

That's a mandatory warning inside Perl, which happens only if you
modify your script and C<Apache::Registry> reloads it. Perl is warning
you that the subroutine(s) were redefined. It is mostly harmless. If
you don't like seeing these warnings, just C<kill -USR1> (graceful
restart) Apache when you modify your scripts.
You aren't supposed to see these warnings with perl 5.004_05 or 5.005
and higher if you don't modify the code. If you still experience a
problem with code within a CGI script, moving all the code into a
module (or a library) and require()ing it should solve the problem.

=head2 Can't undef active subroutine

  Can't undef active subroutine at
  /usr/apps/lib/perl5/site_perl/5.005/aix/Apache/Registry.pm line 102.
  Called from package Apache::Registry, filename
  /usr/apps/lib/perl5/site_perl/5.005/aix/Apache/Registry.pm, line 102

This problem is caused when a client drops the connection while httpd
is in the middle of a write. httpd times out, sending a SIGPIPE, and
Perl (in that child) is stuck in the middle of its eval context. This
is fixed by the C<Apache::SIG> module, which is used by default. It
should not happen unless you have code that is messing with
B<$SIG{PIPE}>. It's also triggered only when you've changed your
script on disk and mod_perl is trying to reload it.

=head2 [warn] child process 30388 did not exit, sending another SIGHUP

From mod_perl.pod: with Apache versions 1.3.0 and higher, mod_perl
will call the perl_destruct() Perl API function during the child exit
phase. This will cause proper execution of C<END> blocks found during
server startup, as well as invoking the C<DESTROY> method on global
objects which are still alive. It is possible that this operation may
take a long time to finish, causing problems during a restart. If your
code does not contain any C<END> blocks or C<DESTROY> methods which
need to be run during child server shutdown, this destruction can be
avoided by setting the C<PERL_DESTRUCT_LEVEL> environment variable to
C<-1>. Be aware that 'your code' includes any modules you use, and
I<they> may well have C<DESTROY> methods and C<END> blocks...

=head2 Processes Get Stuck on Graceful Restart

If you see a process stuck in "G" (Gracefully finishing) after doing a
graceful restart (sending S<kill -SIGUSR1>), it means that the process
is hanging in perl_destruct() while trying to clean up. This cleanup
normally isn't a requirement; you can disable it by setting the
C<PERL_DESTRUCT_LEVEL> environment variable to C<-1>. See the section
"L<Speeding up the Apache Termination and
Restart|control/Speeding_up_the_Apache_Terminati>" for more
information.

=head2 httpd keeps on growing after each restart

See the I<HUP Signal> explanation in the section L<Server Stopping and
Restarting|control/Server_Stopping_and_Restarting>.

=head1 Windows OS specific notes

=head2 Apache::DBI

C<Apache::DBI> causes the server to exit when it starts up, with:

  [Mon Oct 25 15:06:11 1999] file .\main\http_main.c, line 5890,
  assertion "start_mutex" failed

Solution: build mod_perl with C<PERL_STARTUP_DONE_CHECK> set, e.g.
insert:

  #define PERL_STARTUP_DONE_CHECK 1

at the top of I<mod_perl.h>, or add it to the defines in the MSVC++
(and similar applications') C<Options> dialog.

Apache loads all Apache modules twice, to make sure the server will
successfully restart when asked to. This flag disables all
C<PerlRequire> and C<PerlModule> statements on the first load, so they
can succeed on the second load. Without that flag, the second load
fails.
=cut

1.1  modperl-docs/src/docs/1.0/guide/code/DB_File-Lock2.pm

Index: DB_File-Lock2.pm
===================================================================
package DB_File::Lock2;
require 5.004;

use strict;
BEGIN {
  # RCS/CVS compliant: must be all one line, for MakeMaker
  $DB_File::Lock2::VERSION = do { my @r = (q$Revision: 1.1 $ =~ /\d+/g); sprintf "%d."."%02d" x $#r, @r };
}

use DB_File ();
use Fcntl qw(:flock O_RDWR O_CREAT);
use Carp qw(croak carp verbose);
use Symbol ();

@DB_File::Lock2::ISA     = qw( DB_File );
%DB_File::Lock2::lockfhs = ();

use constant DEBUG => 0;

# file creation permissions mode
use constant PERM_MODE => 0660;

# file locking modes
%DB_File::Lock2::locks =
  (
   read  => LOCK_SH,
   write => LOCK_EX,
  );

# SYNOPSIS:
# tie my %mydb, 'DB_File::Lock2', $filepath,
#     ['read' || 'write', 'HASH' || 'BTREE']
# while (my($k,$v) = each %mydb) {
#     print "$k => $v\n";
# }
# untie %mydb;
#########
sub TIEHASH {
    my $class     = shift;
    my $file      = shift;
    my $lock_mode = lc shift || 'read';
    my $db_type   = shift || 'HASH';

    die "Dunno about lock mode: [$lock_mode].\n Valid modes are 'read' or 'write'.\n"
      unless $lock_mode eq 'read' or $lock_mode eq 'write';

    # Critical section starts here if in write mode!

    # create an external lock
    my $lockfh = Symbol::gensym();
    open $lockfh, ">$file.lock"
      or die "Cannot open $file.lock for writing: $!\n";
    unless (flock $lockfh, $DB_File::Lock2::locks{$lock_mode}) {
        croak "cannot flock: $lock_mode => $DB_File::Lock2::locks{$lock_mode}: $!\n";
    }

    my $self = $class->SUPER::TIEHASH
      ($file,
       O_RDWR|O_CREAT,
       PERM_MODE,
       ($db_type eq 'BTREE' ? $DB_File::DB_BTREE : $DB_File::DB_HASH)
      );

    # remove the package name in case re-blessing occurs
    (my $id = "$self") =~ s/^[^=]+=//;

    # cache the lock fh
    $DB_File::Lock2::lockfhs{$id} = $lockfh;

    return $self;
} # end of sub TIEHASH

# DESTROY is automatically called when a tied variable goes out of
# scope, on explicit untie() or when the program is interrupted,
# e.g. with a die() call.
#
# It unties the db by forwarding the call to the parent class, unlocks
# the file and removes it from the cache of locks.
###########
sub DESTROY {
    my $self = shift;
    $self->SUPER::DESTROY(@_);

    # now it's safe to unlock the file (close() unlocks as well).
    # Since the object has gone, we remove its lock filehandle entry
    # from the cache.
    (my $id = "$self") =~ s/^[^=]+=//; # see 'sub TIEHASH'
    close delete $DB_File::Lock2::lockfhs{$id};

    # Critical section ends here if in write mode!
print "Destroying ".__PACKAGE__."\n" if DEBUG; } #### END { print "Calling the END from ".__PACKAGE__."\n" if DEBUG; } 1; 1.1 modperl-docs/src/docs/1.0/guide/code/My-DB.pm Index: My-DB.pm =================================================================== package My::DB; use strict; use 5.004; use DBI; use vars qw(%c); use constant DEBUG => 0; %c = ( db => { DB_NAME => 'foo', SERVER => 'localhost', USER => 'put_username_here', USER_PASSWD => 'put_passwd_here', }, ); use Carp qw(croak verbose); #local $SIG{__WARN__} = \&Carp::cluck; # untaint the path by explicit setting local $ENV{PATH} = '/bin:/usr/bin'; ####### sub new { my $proto = shift; my $class = ref($proto) || $proto; my $self = {}; # connect to the DB, Apache::DBI takes care of caching the connections # save into a dbh - Database handle object $self->{dbh} = DBI->connect("DBI:mysql:$c{db}{DB_NAME}::$c{db}{SERVER}", $c{db}{USER}, $c{db}{USER_PASSWD}, { PrintError => 1, # warn() on errors RaiseError => 0, # don't die on error AutoCommit => 1, # commit executes immediately } ) or die "Cannot connect to database: $DBI::errstr"; # we want to die on errors if in debug mode $self->{dbh}->{RaiseError} = 1 if DEBUG; # init the sth - Statement handle object $self->{sth} = ''; bless ($self, $class); $self; } # end of sub new ###################################################################### ################################### ### ### ### SQL Functions ### ### ### ################################### ###################################################################### # print debug messages sub d{ # we want to print the trace in debug mode print "<DT><B>".join("<BR>", @_)."</B>\n" if DEBUG; } # end of sub d ###################################################################### # return a count of matched rows, by conditions # # $count = sql_count_matched($table_name,[EMAIL PROTECTED],[EMAIL PROTECTED]); # # conditions must be an array so we can pass more than one column with # the same name. # # @conditions = ( column => ['comp_sign','value'], # foo => ['>',15], # foo => ['<',30], # ); # # The sub knows automatically to detect and quote strings # # Restrictions are the list of restrictions like ('order by email') # ########################## sub sql_count_matched{ my $self = shift; my $table = shift || ''; my $r_conds = shift || []; my $r_restr = shift || []; # we want to print the trace in debug mode d( "[".(caller(2))[3]." - ".(caller(1))[3]." - ". (caller(0))[3]."]") if DEBUG; # build the query my $do_sql = "SELECT COUNT(*) FROM $table "; my @where = (); for(my $i=0;$i<@{$r_conds};$i=$i+2) { push @where, join " ", $$r_conds[$i], $$r_conds[$i+1][0], sql_quote(sql_escape($$r_conds[$i+1][1])); } # Add the where clause if we have one $do_sql .= "WHERE ". join " AND ", @where if @where; # restrictions (DONT put commas!) $do_sql .= " ". join " ", @{$r_restr} if @{$r_restr}; d("SQL: $do_sql") if DEBUG; # do query $self->{sth} = $self->{dbh}->prepare($do_sql); $self->{sth}->execute(); my ($count) = $self->{sth}->fetchrow_array; d("Result: $count") if DEBUG; $self->{sth}->finish; return $count; } # end of sub sql_count_matched ###################################################################### # return a count of matched distinct rows, by conditions # # $count = sql_count_matched_distinct($table_name,[EMAIL PROTECTED],[EMAIL PROTECTED]); # # conditions must be an array so we can path more than one column with # the same name. 
#
# @conditions = ( column => ['comp_sign','value'],
#                 foo    => ['>',15],
#                 foo    => ['<',30],
#               );
#
# The sub knows automatically to detect and quote strings
#
# Restrictions are the list of restrictions like ('order by email')
#
# This is a slow implementation, because it cannot use select(*), but
# brings all the records in first and then counts them. In the next
# version of mysql there will be an operator 'select (distinct *)',
# which will make things much faster, so we will just change the
# internals of this sub, without changing the code that calls it.
#
##############################
sub sql_count_matched_distinct {
    my $self    = shift;
    my $table   = shift || '';
    my $r_conds = shift || [];
    my $r_restr = shift || [];

    # we want to print the trace in debug mode
    d("[".(caller(2))[3]." - ".(caller(1))[3]." - ".(caller(0))[3]."]")
      if DEBUG;

    # build the query
    my $do_sql = "SELECT DISTINCT * FROM $table ";
    my @where  = ();
    for(my $i=0; $i<@{$r_conds}; $i=$i+2) {
        push @where, join " ",
          $$r_conds[$i], $$r_conds[$i+1][0],
          sql_quote(sql_escape($$r_conds[$i+1][1]));
    }

    # Add the where clause if we have one
    $do_sql .= "WHERE ". join " AND ", @where if @where;

    # restrictions (DON'T put commas!)
    $do_sql .= " ". join " ", @{$r_restr} if @{$r_restr};

    d("SQL: $do_sql") if DEBUG;

    # do query
    # $self->{sth} = $self->{dbh}->prepare($do_sql);
    # $self->{sth}->execute();
    my $count = @{$self->{dbh}->selectall_arrayref($do_sql)};
    # my ($count) = $self->{sth}->fetchrow_array;

    d("Result: $count") if DEBUG;

    # $self->{sth}->finish;

    return $count;
} # end of sub sql_count_matched_distinct

######################################################################
# return a single (first) matched value or undef, by conditions and
# restrictions
#
# sql_get_matched_value($table_name,$column,\@conditions,\@restrictions);
#
# column is the name of the column
#
# conditions must be an array so we can pass more than one column with
# the same name.
# @conditions = ( column => ['comp_sign','value'],
#                 foo    => ['>',15],
#                 foo    => ['<',30],
#               );
# The sub knows automatically to detect and quote strings
#
# restrictions is a list of restrictions like ('order by email')
#
##########################
sub sql_get_matched_value {
    my $self    = shift;
    my $table   = shift || '';
    my $column  = shift || '';
    my $r_conds = shift || [];
    my $r_restr = shift || [];

    # we want to print the trace in debug mode
    d("[".(caller(2))[3]." - ".(caller(1))[3]." - ".(caller(0))[3]."]")
      if DEBUG;

    # build the query
    my $do_sql = "SELECT $column FROM $table ";
    my @where  = ();
    for(my $i=0; $i<@{$r_conds}; $i=$i+2) {
        push @where, join " ",
          $$r_conds[$i], $$r_conds[$i+1][0],
          sql_quote(sql_escape($$r_conds[$i+1][1]));
    }

    # Add the where clause if we have one
    $do_sql .= " WHERE ". join " AND ", @where if @where;

    # restrictions (DON'T put commas!)
    $do_sql .= " ". join " ", @{$r_restr} if @{$r_restr};

    d("SQL: $do_sql") if DEBUG;

    # do query
    return $self->{dbh}->selectrow_array($do_sql);
} # end of sub sql_get_matched_value

######################################################################
# return a single row of the first matched rows, by conditions and
# restrictions. The row is inserted into the @results_row array
# (value1,value2,...), which is left empty () if none matched
#
# sql_get_matched_row(\@results_row,$table_name,\@columns,\@conditions,\@restrictions);
#
# columns is a list of columns to be returned (username, fname,...)
#
# conditions must be an array so we can pass more than one column with
# the same name.
# @conditions = ( column => ['comp_sign','value'],
#                 foo    => ['>',15],
#                 foo    => ['<',30],
#               );
# The sub knows automatically to detect and quote strings
#
# restrictions is a list of restrictions like ('order by email')
#
##########################
sub sql_get_matched_row {
    my $self    = shift;
    my $r_row   = shift || [];   # must be an array reference
    my $table   = shift || '';
    my $r_cols  = shift || [];
    my $r_conds = shift || [];
    my $r_restr = shift || [];

    # we want to print the trace in debug mode
    d("[".(caller(2))[3]." - ".(caller(1))[3]." - ".(caller(0))[3]."]")
      if DEBUG;

    # build the query
    my $do_sql = "SELECT ";
    $do_sql .= join ",", @{$r_cols} if @{$r_cols};
    $do_sql .= " FROM $table ";
    my @where = ();
    for(my $i=0; $i<@{$r_conds}; $i=$i+2) {
        push @where, join " ",
          $$r_conds[$i], $$r_conds[$i+1][0],
          sql_quote(sql_escape($$r_conds[$i+1][1]));
    }

    # Add the where clause if we have one
    $do_sql .= " WHERE ". join " AND ", @where if @where;

    # restrictions (DON'T put commas!)
    $do_sql .= " ". join " ", @{$r_restr} if @{$r_restr};

    d("SQL: $do_sql") if DEBUG;

    # do query
    @{$r_row} = $self->{dbh}->selectrow_array($do_sql);
} # end of sub sql_get_matched_row

######################################################################
# return a ref to a hash of a single matched row, by conditions and
# restrictions: (column1 => value1, column2 => value2). Returns undef
# if none matched.
#
# sql_get_hash_ref($table_name,\@columns,\@conditions,\@restrictions);
#
# columns is a list of columns to be returned (username, fname,...)
#
# conditions must be an array so we can pass more than one column with
# the same name.
# @conditions = ( column => ['comp_sign','value'],
#                 foo    => ['>',15],
#                 foo    => ['<',30],
#               );
# The sub knows automatically to detect and quote strings
#
# restrictions is a list of restrictions like ('order by email')
#
##########################
sub sql_get_hash_ref {
    my $self    = shift;
    my $table   = shift || '';
    my $r_cols  = shift || [];
    my $r_conds = shift || [];
    my $r_restr = shift || [];

    # we want to print the trace in debug mode
    d("[".(caller(2))[3]." - ".(caller(1))[3]." - ".(caller(0))[3]."]")
      if DEBUG;

    # build the query
    my $do_sql = "SELECT ";
    $do_sql .= join ",", @{$r_cols} if @{$r_cols};
    $do_sql .= " FROM $table ";
    my @where = ();
    for(my $i=0; $i<@{$r_conds}; $i=$i+2) {
        push @where, join " ",
          $$r_conds[$i], $$r_conds[$i+1][0],
          sql_quote(sql_escape($$r_conds[$i+1][1]));
    }

    # Add the where clause if we have one
    $do_sql .= " WHERE ". join " AND ", @where if @where;

    # restrictions (DON'T put commas!)
    $do_sql .= " ". join " ", @{$r_restr} if @{$r_restr};

    d("SQL: $do_sql") if DEBUG;

    # do query
    $self->{sth} = $self->{dbh}->prepare($do_sql);
    $self->{sth}->execute();

    return $self->{sth}->fetchrow_hashref;
} # end of sub sql_get_hash_ref

######################################################################
# returns a reference to an array, matched by conditions and
# restrictions, which contains one reference to an array per row. If
# there are no rows to return, returns a reference to an empty array:
# [
#  [array1],
#  ......
#  [arrayN],
# ];
#
# $ref = sql_get_matched_rows_ary_ref($table_name,\@columns,\@conditions,\@restrictions);
#
# columns is a list of columns to be returned (username, fname,...)
#
# conditions must be an array so we can pass more than one column with
# the same name.
# @conditions are concatenated with AND
# @conditions = ( column => ['comp_sign','value'],
#                 foo    => ['>',15],
#                 foo    => ['<',30],
#               );
# results in
# WHERE foo > 15 AND foo < 30
#
# to get OR logic use this form (the groups are then ANDed):
# @conditions = ( column => ['comp_sign',['value1','value2']],
#                 foo    => ['=',[15,24] ],
#                 bar    => ['=',[16,21] ],
#               );
# results in
# WHERE (foo = 15 OR foo = 24) AND (bar = 16 OR bar = 21)
#
# The sub knows automatically to detect and quote strings
#
# restrictions is a list of restrictions like ('order by email')
#
##########################
sub sql_get_matched_rows_ary_ref {
    my $self    = shift;
    my $table   = shift || '';
    my $r_cols  = shift || [];
    my $r_conds = shift || [];
    my $r_restr = shift || [];

    # we want to print the trace in debug mode
    d("[".(caller(2))[3]." - ".(caller(1))[3]." - ".(caller(0))[3]."]")
      if DEBUG;

    # build the query
    my $do_sql = "SELECT ";
    $do_sql .= join ",", @{$r_cols} if @{$r_cols};
    $do_sql .= " FROM $table ";
    my @where = ();
    for(my $i=0; $i<@{$r_conds}; $i=$i+2) {
        if (ref $$r_conds[$i+1][1] eq 'ARRAY') {
            # multiple conditions for the same field/comparator, to be ORed
            push @where, map {"($_)"} join " OR ",
              map { join " ",
                      $r_conds->[$i],
                      $r_conds->[$i+1][0],
                      sql_quote(sql_escape($_));
                  } @{$r_conds->[$i+1][1]};
        } else {
            # a single condition for the field/comparator
            push @where, join " ",
              $r_conds->[$i],
              $r_conds->[$i+1][0],
              sql_quote(sql_escape($r_conds->[$i+1][1]));
        }
    } # end of for(my $i=0; $i<@{$r_conds}; $i=$i+2)

    # Add the where clause if we have one
    $do_sql .= " WHERE ". join " AND ", @where if @where;

    # restrictions (DON'T put commas!)
    $do_sql .= " ". join " ", @{$r_restr} if @{$r_restr};

    d("SQL: $do_sql") if DEBUG;

    # do query
    return $self->{dbh}->selectall_arrayref($do_sql);
} # end of sub sql_get_matched_rows_ary_ref

######################################################################
# insert a single row into a DB
#
# sql_insert_row($table_name,\%data,$delayed);
#
# data is a hash of the form (column1 => value1, column2 => value2)
#
# $delayed: 1 => do a delayed insert, 0 or none passed => immediate
#
# * The sub knows automatically to detect and quote strings
#
# * The insert is delayed, so the user will not wait until the insert
#   is completed if many select queries are running
#
##########################
sub sql_insert_row {
    my $self    = shift;
    my $table   = shift || '';
    my $r_data  = shift || {};
    my $delayed = (shift) ? 'DELAYED' : '';

    # we want to print the trace in debug mode
    d("[".(caller(2))[3]." - ".(caller(1))[3]." - ".(caller(0))[3]."]")
      if DEBUG;

    # build the query
    my $do_sql = "INSERT $delayed INTO $table ";
    $do_sql .= "(".join(",",keys %{$r_data}).")";
    $do_sql .= " VALUES (";
    $do_sql .= join ",", sql_quote(sql_escape( values %{$r_data} ));
    $do_sql .= ")";

    d("SQL: $do_sql") if DEBUG;

    # do query
    $self->{sth} = $self->{dbh}->prepare($do_sql);
    $self->{sth}->execute();
} # end of sub sql_insert_row

######################################################################
# update rows in a DB by condition
#
# sql_update_rows($table_name,\%data,\@conditions,$delayed);
#
# data is a hash of the form (column1 => value1, column2 => value2)
#
# conditions must be an array so we can pass more than one column with
# the same name.
# @conditions = ( column => ['comp_sign','value'],
#                 foo    => ['>',15],
#                 foo    => ['<',30],
#               );
#
# $delayed: 1 => do a low-priority update, 0 or none passed => immediate
#
# * The sub knows automatically to detect and quote strings
#
##########################
sub sql_update_rows {
    my $self    = shift;
    my $table   = shift || '';
    my $r_data  = shift || {};
    my $r_conds = shift || [];
    my $delayed = (shift) ? 'LOW_PRIORITY' : '';

    # we want to print the trace in debug mode
    d("[".(caller(2))[3]." - ".(caller(1))[3]." - ".(caller(0))[3]."]")
      if DEBUG;

    # build the query
    my $do_sql = "UPDATE $delayed $table SET ";
    $do_sql .= join ",",
      map { "$_=".join "",sql_quote(sql_escape($$r_data{$_})) } keys %{$r_data};
    my @where = ();
    for(my $i=0; $i<@{$r_conds}; $i=$i+2) {
        push @where, join " ",
          $$r_conds[$i], $$r_conds[$i+1][0],
          sql_quote(sql_escape($$r_conds[$i+1][1]));
    }

    # Add the where clause if we have one
    $do_sql .= " WHERE ". join " AND ", @where if @where;

    d("SQL: $do_sql") if DEBUG;

    # do query
    $self->{sth} = $self->{dbh}->prepare($do_sql);
    $self->{sth}->execute();

    # my ($count) = $self->{sth}->fetchrow_array;
    #
    # d("Result: $count") if DEBUG;
} # end of sub sql_update_rows

######################################################################
# delete rows from a DB by condition
#
# sql_delete_rows($table_name,\@conditions);
#
# conditions must be an array so we can pass more than one column with
# the same name.
# @conditions = ( column => ['comp_sign','value'],
#                 foo    => ['>',15],
#                 foo    => ['<',30],
#               );
#
# * The sub knows automatically to detect and quote strings
#
##########################
sub sql_delete_rows {
    my $self    = shift;
    my $table   = shift || '';
    my $r_conds = shift || [];

    # we want to print the trace in debug mode
    d("[".(caller(2))[3]." - ".(caller(1))[3]." - ".(caller(0))[3]."]")
      if DEBUG;

    # build the query
    my $do_sql = "DELETE FROM $table ";
    my @where  = ();
    for(my $i=0; $i<@{$r_conds}; $i=$i+2) {
        push @where, join " ",
          $$r_conds[$i], $$r_conds[$i+1][0],
          sql_quote(sql_escape($$r_conds[$i+1][1]));
    }

    # Must be very careful with deletes: imagine that somehow @where
    # doesn't get set -- "DELETE FROM table" deletes the contents of
    # the whole table
    warn("Attempt to delete the whole table $table from the DB!!!\n"), return
      unless @where;

    # Add the where clause (we always have one at this point)
    $do_sql .= " WHERE ". join " AND ", @where;

    d("SQL: $do_sql") if DEBUG;

    # do query
    $self->{sth} = $self->{dbh}->prepare($do_sql);
    $self->{sth}->execute();
} # end of sub sql_delete_rows

######################################################################
# execute the passed query and return a reference to an array which
# contains one reference per row. If there are no rows to return,
# returns a reference to an empty array.
#
# $r_array = sql_execute_and_get_r_array($query);
#
##########################
sub sql_execute_and_get_r_array {
    my $self   = shift;
    my $do_sql = shift || '';

    # we want to print the trace in debug mode
    d("[".(caller(2))[3]." - ".(caller(1))[3]." - ".
(caller(0))[3]."]") if DEBUG; d("SQL: $do_sql") if DEBUG; $self->{dbh}->selectall_arrayref($do_sql); } # end of sub sql_execute_and_get_r_array ###################################################################### # lock the passed tables in the requested mode (READ|WRITE) and set # internal flag to handle possible user abortions, so the tables will # be unlocked thru the END{} block # # sql_lock_tables('table1','lockmode',..,'tableN','lockmode' # lockmode = (READ | WRITE) # # _side_effect_ $self->{lock} = 'On'; # ########################## sub sql_lock_tables{ my $self = shift; my %modes = @_; return unless %modes; my $do_sql = 'LOCK TABLES '; $do_sql .= join ",", map {"$_ $modes{$_}"} keys %modes; # we want to print the trace in debug mode d( "[".(caller(2))[3]." - ".(caller(1))[3]." - ". (caller(0))[3]."]") if DEBUG; d("SQL: $do_sql") if DEBUG; $self->{sth} = $self->{dbh}->prepare($do_sql); $self->{sth}->execute(); # Enough to set only one lock, unlock will remove them all $self->{lock} = 'On'; } # end of sub sql_lock_tables ###################################################################### # unlock all tables, unset internal flag to handle possible user # abortions, so the tables will be unlocked thru the END{} block # # sql_unlock_tables() # # _side_effect_: delete $self->{lock} # ########################## sub sql_unlock_tables{ my $self = shift; # we want to print the trace in debug mode d( "[".(caller(2))[3]." - ".(caller(1))[3]." - ". (caller(0))[3]."]") if DEBUG; $self->{dbh}->do("UNLOCK TABLES"); # Enough to set only one lock, unlock will remove them all delete $self->{lock}; } # end of sub sql_unlock_tables # # # return current date formatted for a DATE field type # YYYYMMDD # # Note: since this function actually doesn't need an object it's being # called without parameter as well as procedural call ############ sub sql_date{ my $self = shift; my ($mday,$mon,$year) = (localtime)[3..5]; return sprintf "%0.4d%0.2d%0.2d",1900+$year,++$mon,$mday; } # end of sub sql_date # # # return current date formatted for a DATE field type # YYYYMMDDHHMMSS # # Note: since this function actually doesn't need an object it's being # called without parameter as well as procedural call ############ sub sql_datetime{ my $self = shift; my ($sec,$min,$hour,$mday,$mon,$year) = localtime(); return sprintf "%0.4d%0.2d%0.2d%0.2d%0.2d%0.2d",1900+$year,++$mon,$mday,$hour,$min,$sec; } # end of sub sql_datetime # Quote the list of parameters. Parameters consisting entirely of # digits (i.e. integers) are unquoted. # print sql_quote("one",2,"three"); => 'one', 2, 'three' ############# sub sql_quote{ map{ /^(\d+|NULL)$/ ? $_ : "\'$_\'" } @_ } # Escape the list of parameters (all unsafe chars like ",' are escaped) # We make a copy of @_ since we might try to change the passed values, # producing an error when modification of a read-only value is attempted ############## sub sql_escape{ my @a = @_; map { s/([\'\\])/\\$1/g;$_} @a } # DESTROY makes all kinds of cleanups if the fuctions were interuppted # before their completion and haven't had a chance to make a clean up. 
###########
sub DESTROY {
    my $self = shift;

    $self->sql_unlock_tables()  if $self->{lock};
    $self->{sth}->finish        if $self->{sth};
    $self->{dbh}->disconnect    if $self->{dbh};
} # end of sub DESTROY

# Don't remove
1;

1.1  modperl-docs/src/docs/1.0/guide/code/lwp-bench.pl

Index: lwp-bench.pl
===================================================================
#!/usr/bin/perl -w

use LWP::Parallel::UserAgent;
use Time::HiRes qw(gettimeofday tv_interval);
use strict;

###
# Configuration
###
my $nof_parallel_connections = 10;
my $nof_requests_total = 100;
my $timeout = 10;
my @urls = (
            'http://www.example.com:81/perl/faq_manager/faq_manager.pl',
            'http://www.example.com:81/perl/access/access.cgi',
           );

##################################################
# Derived Class for latency timing
##################################################
package MyParallelAgent;
@MyParallelAgent::ISA = qw(LWP::Parallel::UserAgent);
use strict;

###
# Is called when connection is opened
###
sub on_connect {
    my ($self, $request, $response, $entry) = @_;
    $self->{__start_times}->{$entry} = [Time::HiRes::gettimeofday];
}

###
# Are called when connection is closed
###
sub on_return {
    my ($self, $request, $response, $entry) = @_;
    my $start = $self->{__start_times}->{$entry};
    $self->{__latency_total} += Time::HiRes::tv_interval($start);
}

sub on_failure {
    on_return(@_);  # Same procedure
}

###
# Access function for new instance var
###
sub get_latency_total {
    return shift->{__latency_total};
}

##################################################
package main;
##################################################
###
# Init parallel user agent
###
my $ua = MyParallelAgent->new();
$ua->agent("pounder/1.0");
$ua->max_req($nof_parallel_connections);
$ua->redirect(0);    # No redirects

###
# Register all requests
###
foreach (1..$nof_requests_total) {
    foreach my $url (@urls) {
        my $request = HTTP::Request->new('GET', $url);
        $ua->register($request);
    }
}

###
# Launch processes and check time
###
my $start_time = [gettimeofday];
my $results = $ua->wait($timeout);
my $total_time = tv_interval($start_time);

###
# Requests all done, check results
###
my $succeeded = 0;
my %errors = ();

foreach my $entry (values %$results) {
    my $response = $entry->response();
    if ($response->is_success()) {
        $succeeded++;    # Another satisfied customer
    } else {
        # Error, save the message
        $response->message("TIMEOUT") unless $response->code();
        $errors{$response->message}++;
    }
}

###
# Format errors if any from %errors
###
my $errors = join(',', map "$_ ($errors{$_})", keys %errors);
$errors = "NONE" unless $errors;

###
# Format results
###
[EMAIL PROTECTED] = map {($_,".")} @urls;
my @P = (
    "URL(s)"          => join("\n\t\t ", @urls),
    "Total Requests"  => $nof_requests_total * @urls,
    "Parallel Agents" => $nof_parallel_connections,
    "Succeeded"       => sprintf("$succeeded (%.2f%%)\n",
                           $succeeded * 100 / ($nof_requests_total * @urls)),
    "Errors"          => $errors,
    "Total Time"      => sprintf("%.2f secs\n", $total_time),
    "Throughput"      => sprintf("%.2f Requests/sec\n",
                           ($nof_requests_total * @urls) / $total_time),
    "Latency"         => sprintf("%.2f secs/Request",
                           ($ua->get_latency_total() || 0) /
                           ($nof_requests_total * @urls)),
);

my ($left, $right);

###
# Print out statistics
###
format STDOUT =
@<<<<<<<<<<<<<<< @*
"$left:",        $right
.
while (($left, $right) = splice(@P, 0, 2)) {
    write;
}

1.1  modperl-docs/src/docs/1.0/guide/code/mysql-3.22.29_backup.pl

Index: mysql-3.22.29_backup.pl
===================================================================
#!/usr/bin/perl -w

# This script should be run from the crontab every night, or at
# shorter intervals. It does a few things:
# 1. dumps all the tables into separate dump files (these dump files
#    are ready for a DB restore)
# 2. backs up the last update log file and creates a new log file

use strict;

my $data_dir   = "/var/lib/mysql";
my $update_log = "$data_dir/www.bar.com";
my $dump_dir   = "$data_dir/dump";
my $gzip_exec  = "/bin/gzip";
my @db_names   = qw(bugs mysql bonsai);
my $mysql_admin_exec = "/usr/bin/mysqladmin ";

# convert unix time to date + time
my ($sec,$min,$hour,$mday,$mon,$year) = localtime(time);
my $time = sprintf("%0.2d:%0.2d:%0.2d", $hour, $min, $sec);
my $date = sprintf("%0.2d.%0.2d.%0.4d", ++$mon, $mday, $year+1900);
my $timestamp = "$date.$time";

# dump all the DBs we want to backup
foreach my $db_name (@db_names) {
    my $dump_file = "$dump_dir/$timestamp.$db_name.dump";
    my $dump_command = "/usr/bin/mysqldump -c -e -l -q --flush-logs $db_name > $dump_file";
    system $dump_command;
}

# move the update log to the backup, for a later restore if needed
rename $update_log, "$dump_dir/$timestamp.log" if -e $update_log;

# restart the update log, to log to a new file!
`/usr/bin/mysqladmin refresh`;

# compress all the created files
system "$gzip_exec $dump_dir/$timestamp.*";

1.1  modperl-docs/src/docs/1.0/guide/code/mysql-3.22.29_restore.pl

Index: mysql-3.22.29_restore.pl
===================================================================
#!/usr/bin/perl -w

# This script restores the DBs.
# Usage: mysql.restore.pl update.log.gz dump.db1.gz [... dump.dbn.gz]
#
# All the dump* files are compressed, as we expect them to be created
# by the mysql.backup utility, and to carry the .dump.gz extension.
# example:
# % mysql.restore.pl myhostname.log.gz 12.10.1998.16:37:12.*.dump.gz

use strict;
use FindBin qw($Bin);

my $data_dir  = "/var/lib/mysql";
my $dump_dir  = "$data_dir/dump";
my $gzip_exec = "/bin/gzip";

my $mysql_exec        = "/usr/bin/mysql -f ";
my $mysql_backup_exec = "$Bin/mysql.backup.pl";
my $mysql_admin_exec  = "/usr/bin/mysqladmin ";

my $update_log_file = '';
my @dump_files      = ();

# split the input files into an update log and the dump files
foreach (@ARGV) {
    push(@dump_files, $_), next unless /\.log\.gz/;
    $update_log_file = $_;
}

die "Usage: mysql.restore.pl update.log.gz dump.db1.gz [... dump.dbn.gz]\n"
  unless @dump_files;

# load the dump files
foreach (@dump_files) {
    # check that the file exists
    warn("Can't locate $_"), next unless -e $_;

    # extract the db name from the dump file
    my ($db_name) = /\d\d\.\d\d.\d\d.\d\d:\d\d:\d\d\.(\w+)\.dump\.gz/;
    warn("Can't extract the DB name from the file name, probably an error in the file format"), next
      unless defined $db_name and $db_name;

    # we want to drop the database, since the restore will rebuild it!
  1.1                  modperl-docs/src/docs/1.0/guide/code/mysql-3.22.29_restore.pl

Index: mysql-3.22.29_restore.pl
===================================================================
#!/usr/bin/perl -w

# This script restores the DBs.
# Usage: mysql.restore.pl update.log.gz dump.db1.gz [... dump.dbn.gz]
# All the dump* files are compressed, as we expect them to be created
# by the mysql.backup utility (i.e. to carry the .dump.gz extension).
# Example:
# % mysql.restore.pl myhostname.log.gz 12.10.1998.16:37:12.*.dump.gz

use strict;
use FindBin qw($Bin);

my $data_dir          = "/var/lib/mysql";
my $dump_dir          = "$data_dir/dump";
my $gzip_exec         = "/bin/gzip";
my $mysql_exec        = "/usr/bin/mysql -f ";
my $mysql_backup_exec = "$Bin/mysql.backup.pl";
my $mysql_admin_exec  = "/usr/bin/mysqladmin ";

my $update_log_file = '';
my @dump_files      = ();

# split the input files into an update log and the dump files
foreach (@ARGV) {
    push(@dump_files, $_), next unless /\.log\.gz/;
    $update_log_file = $_;
}

die "Usage: mysql.restore.pl update.log.gz dump.db1.gz [... dump.dbn.gz]\n"
    unless @dump_files;

# load the dump files
foreach (@dump_files) {
    # check that the file exists
    warn("Can't locate $_"), next unless -e $_;

    # extract the db name from the dump file name
    my ($db_name) = /\d\d\.\d\d\.\d\d\d\d\.\d\d:\d\d:\d\d\.(\w+)\.dump\.gz/;
    warn("Can't extract the DB name from the file name, ".
         "probably an error in the file format"), next
        unless defined $db_name and $db_name;

    # we want to drop the db, since the restore will rebuild it!
    # force the drop without confirmation
    my $drop_command = "$mysql_admin_exec -f drop $db_name";
    system $drop_command;

    $drop_command = "$mysql_admin_exec create $db_name";
    system $drop_command;

    # build the restore command and execute it
    my $restore_command = "$gzip_exec -cd $_ | $mysql_exec $db_name";
    system $restore_command;
}

# now load the update log file (update the db with the changes made
# since the last dump)
if (-e $update_log_file) {
    my $restore_command = "$gzip_exec -cd $update_log_file | $mysql_exec";
    system $restore_command;
} else {
    warn "Can't locate $update_log_file";
}

# rerun mysql.backup.pl: since we have just reloaded the dump files
# and the update log, we must rebuild the backups!
system $mysql_backup_exec;

  1.1                  modperl-docs/src/docs/1.0/guide/code/mysql-3.22.30+_backup.pl

Index: mysql-3.22.30+_backup.pl
===================================================================
#!/usr/bin/perl -w

# This script should be run from the crontab every night, or at
# shorter intervals.  It does two things:
# 1. dump each of the tables into separate dump files (these dump
#    files are ready for a DB restore)
# 2. back up the last update log files and start a new one
#
# This script originates from the perl.apache.org site, but I have
# adapted it to work properly with the newer versions of MySQL, where
# the log files are named differently.
# WVW 14/02/2000 [EMAIL PROTECTED]

use strict;

my $data_dir         = "/var/lib/mysql";
my $update_log       = "$data_dir/central2.001";
my $dump_dir         = "$data_dir/backup";
my $gzip_exec        = "/bin/gzip";
my @db_names         = qw(mysql besup);
my $mysql_admin_exec = "/usr/bin/mysqladmin ";
my $hostname         = "central2";
my $password         = "babedb";

# convert unix time to date + time
my ($sec,$min,$hour,$mday,$mon,$year) = localtime(time);
my $time = sprintf("%02d:%02d:%02d", $hour, $min, $sec);
my $date = sprintf("%02d.%02d.%04d", ++$mon, $mday, $year+1900);
my $timestamp = "$date.$time";

# dump all the DBs we want to back up
foreach my $db_name (@db_names) {
    my $dump_file = "$dump_dir/$timestamp.$db_name.dump";
    my $dump_command = "/usr/bin/mysqldump -c -e -l -q --flush-logs ".
                       "-p$password $db_name > $dump_file";
    system $dump_command;
}

# collect the numbered update logs into a per-backup directory, for a
# later restore if needed (mode 0700 rather than 0, so the directory
# is actually usable)
mkdir "$dump_dir/$timestamp.log", 0700;
`mv $data_dir/$hostname.[0-9]* $dump_dir/$timestamp.log`;
#rename $update_log, "$dump_dir/$timestamp.log" if -e $update_log;

# restart the update log, so it logs to a new file!
`/usr/bin/mysqladmin refresh -p$password`;

# compress all the created files
system "$gzip_exec $dump_dir/$timestamp.log/*";
system "$gzip_exec $dump_dir/$timestamp.*.dump*";
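
The backticked mv above leaves the `[0-9]*` expansion to the shell. For
reference, a pure-Perl equivalent is sketched below; it is not part of the
original code, the variable values simply mirror the script above, and the
$timestamp value is an example only. rename() suffices here because
$dump_dir lives under $data_dir, i.e. on the same filesystem:

  # hedged sketch: a pure-Perl stand-in for the backticked `mv` above
  use strict;
  my $data_dir  = "/var/lib/mysql";           # values mirror the script
  my $hostname  = "central2";
  my $dump_dir  = "$data_dir/backup";
  my $timestamp = "01.01.2002.00:00:00";      # example value only

  foreach my $log (glob "$data_dir/$hostname.[0-9]*") {
      my ($name) = $log =~ m{([^/]+)$};       # basename of the numbered log
      rename $log, "$dump_dir/$timestamp.log/$name"
          or warn "can't move $log: $!\n";
  }
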
  1.1                  modperl-docs/src/docs/1.0/guide/code/mysql-3.22.30+_restore.pl

Index: mysql-3.22.30+_restore.pl
===================================================================
#!/usr/bin/perl -w

# This script restores the DBs.
# Usage: mysql.restore.pl update.log.dir dump.db1.gz [... dump.dbn.gz]
# All the dump* files are compressed, as we expect them to be created
# by the mysql.backup utility (i.e. to carry the .dump.gz extension).
# Example:
# % mysql.restore.pl 12.10.1998.16:37:12.log 12.10.1998.16:37:12.*.dump.gz

use strict;
use FindBin qw($Bin);

my $data_dir          = "/var/lib/mysql";
my $dump_dir          = "$data_dir/backup";
my $gzip_exec         = "/bin/gzip";
my $mysql_exec        = "/usr/bin/mysql -f -pbabedb";
my $mysql_backup_exec = "$Bin/mysql_backup.pl";
my $mysql_admin_exec  = "/usr/bin/mysqladmin -pbabedb";

my $update_log_dir = '';
my @dump_files     = ();

# split the input files into an update log directory and the dump files
foreach (@ARGV) {
    push(@dump_files, $_), next unless /\.log/;
    $update_log_dir = $_;
}

die "Usage: mysql.restore.pl update.log.dir dump.db1.gz [... dump.dbn.gz]\n"
    unless @dump_files;

# load the dump files
foreach (@dump_files) {
    # check that the file exists
    warn("Can't locate $_"), next unless -e $_;

    # extract the db name from the dump file name
    my ($db_name) = /\d\d\.\d\d\.\d\d\d\d\.\d\d:\d\d:\d\d\.(\w+)\.dump\.gz/;
    warn("Can't extract the DB name from the file name, ".
         "probably an error in the file format"), next
        unless defined $db_name and $db_name;

    # we want to drop the db, since the restore will rebuild it!
    # force the drop without confirmation
    my $drop_command = "$mysql_admin_exec -f drop $db_name";
    system $drop_command;

    $drop_command = "$mysql_admin_exec create $db_name";
    system $drop_command;

    # build the restore command and execute it
    my $restore_command = "$gzip_exec -cd $_ | $mysql_exec $db_name";
    system $restore_command;
}

# now load the update logs (update the db with the changes made since
# the last dump)
if (-d $update_log_dir) {
    my $restore_command = "$gzip_exec -cd $update_log_dir/* | $mysql_exec";
    system $restore_command;
} else {
    warn "Can't locate $update_log_dir";
}

# rerun mysql_backup.pl: since we have just reloaded the dump files
# and the update logs, we must rebuild the backups!
system $mysql_backup_exec;
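
Both restore scripts rely on the same dump-filename convention:
DD.MM.YYYY.HH:MM:SS.<dbname>.dump.gz, as produced by the backup scripts'
timestamp code. A tiny self-contained check of the pattern follows; this is
a sketch, not part of the guide code, and the sample filename is constructed
from that convention ('bugs' is one of the DBs in the 3.22.29 configuration):

  # sketch: demonstrate the dump-filename pattern used by both restore
  # scripts; the sample name follows the backup scripts' timestamp format
  use strict;

  my $file = '12.10.1998.16:37:12.bugs.dump.gz';
  if (my ($db_name) = $file =~ /\d\d\.\d\d\.\d\d\d\d\.\d\d:\d\d:\d\d\.(\w+)\.dump\.gz/) {
      print "DB name: $db_name\n";   # prints: DB name: bugs
  }
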