Here is how to integrate Webalizer into Tomcat.
In case you're wondering, Webalizer is a nice, free web log
analysis tool which produces charts and graphs of your
site usage.  It comes bundled with RedHat when you install Apache.

1.  Include jsp files in the webalizer processing by making the
following entry in webalizer.conf:
        PageType        jsp

2.  Tell webalizer to place the output under Tomcat webapps with the
following entry:
        OutputDir       [your path to webapps]/usage


4. Create a context in Tomcat for the Webalizer output with:
        <!-- Usage Context -->
        <Context path="/usage" docBase="usage" debug="0" privileged="true"/>


5.  Next tell Webalizer the name of the logfile you want it to parse:
        LogFile        [path to Tomcat logs]/access/access.log

   
6. Here is the important part.  By default Tomcat will not generate a
logfile compatible with Webalizer.  HTTP daemons (e.g. Apache) use a log
file format called the common log file format (CLF); Tomcat does not.
However, the logging system in Tomcat is very flexible and extensible.
You have to explicitly tell Tomcat you want it to create a logfile in
a CLF-style format (the "combined" pattern is CLF plus the referer and
user-agent fields).  This is done by adding a <Valve> entry to your
server.xml file, which can go under <Engine>, <Host> or <Context>:

        <!-- Access Logger -->
      <Valve className="org.apache.catalina.valves.AccessLogValve"
              directory="/var/log/tomcat4/access"
              prefix="access"
              suffix=".log"
              resolveHosts="true"
              pattern="combined"/>
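
Once Tomcat has written a few requests, it may be worth checking that the
lines really look like combined format before pointing Webalizer at them.
Here is a rough sketch of such a check.  The function name is made up for
illustration, and the regular expression is only an approximation of the
format (it does not handle escaped quotes inside a field, for example).

```shell
# Succeeds if the line looks like a combined-format entry:
# host ident user [timestamp] "request" status bytes "referer" "user-agent"
# (is_combined_line is a hypothetical helper, not part of Tomcat or Webalizer)
is_combined_line() {
    printf '%s\n' "$1" | grep -Eq \
        '^[^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} ([0-9]+|-) "[^"]*" "[^"]*"$'
}

# Example use against the first line of a rolled log file:
#   is_combined_line "$(head -n 1 /var/log/tomcat4/access/access2002-10-15.log)" \
#       && echo "looks like combined format"
```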

7.  Almost there.  Here's the tricky part.  The valve entry above creates
a file which is "rolled over" each day, meaning it starts a new file each
day.  This is a nice feature, but it means that the file has a different
name each day.  For the entry above the log file for today would be
named access2002-10-15.log.  Unfortunately you can't use dynamic logfile
names in Webalizer.  Remember the entry we made in step 5?  Notice the
file name is access.log and not access2002-10-15.log?  Here's the fix.
Webalizer is run via cron on Linux and via whatever Windows
uses.  As part of my Linux setup there is a file in the /etc/cron.daily
directory called 00Webalizer.  This file gets run every day to call
Webalizer to go and parse the logfile to make the nice graphs.  In that
file add some scripting which will copy your daily logfile to a file
called access.log, then call Webalizer to parse that file, then delete
access.log, and then archive or do whatever you want with the original
logfile.  Here is a simple bash script which does this (the script is
not perfect but it works for now):

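        #!/bin/bash
        # Adjust LOGDIR to match the directory= attribute of your <Valve>.
        LOGDIR=/var/tomcat4/logs/access
        # Today's rolled file, e.g. access2002-10-15.log
        LOGFILE="$LOGDIR/access$(date +%Y-%m-%d).log"
        cp "$LOGFILE" "$LOGDIR/access.log"
        /usr/bin/webalizer
        rm -f "$LOGDIR/access.log"
        exit 0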
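
If you want something a bit more defensive, the same copy / parse / delete
steps could be wrapped in functions along these lines.  This is only a
sketch: the function names dated_log and run_daily are made up, the
"access" prefix and ".log" suffix are assumed to match the valve entry,
and the analyzer command is passed in so the logic can be tried without
Webalizer installed.

```shell
# Build the dated file name the valve writes: <dir>/access<YYYY-MM-DD>.log
# $1 = log directory, $2 = date stamp (YYYY-MM-DD)
dated_log() {
    printf '%s/access%s.log\n' "$1" "$2"
}

# Copy the dated log to access.log, run the analyzer, then clean up.
# $1 = log directory, $2 = date stamp,
# $3 = analyzer command (optional, defaults to /usr/bin/webalizer)
run_daily() {
    src=$(dated_log "$1" "$2")
    if [ ! -r "$src" ]; then
        # Nothing to parse (e.g. no requests were logged for that date)
        echo "no log file: $src" >&2
        return 1
    fi
    cp "$src" "$1/access.log" || return 1
    "${3:-/usr/bin/webalizer}"
    status=$?
    # Remove the temporary copy even if the analyzer failed
    rm -f "$1/access.log"
    return "$status"
}
```

From the cron script it would be called as, for example,
run_daily /var/tomcat4/logs/access "$(date +%Y-%m-%d)".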

8.  Lastly you may want to tell webalizer to ignore any hits it receives
to the /usage webapp.  This is done in the webalizer.conf file with the
entry:
        IgnoreURL       /usage*
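
Since the webalizer.conf entries are spread over several steps, a quick
check that none were missed might look like the sketch below.  The function
name missing_directives is made up for illustration, and it assumes each
directive starts at the beginning of its line, so adjust it if your file is
laid out differently.

```shell
# Print the name of every listed directive that is not set in the file.
# $1 = path to webalizer.conf; remaining arguments = directive names
# (missing_directives is a hypothetical helper, not a Webalizer command)
missing_directives() {
    conf=$1
    shift
    for name in "$@"; do
        # Commented-out lines (starting with #) do not count as set
        grep -Eq "^${name}[[:space:]]+" "$conf" || echo "$name"
    done
}

# Example use (substitute the location of your own webalizer.conf):
#   missing_directives "$CONF" PageType OutputDir LogFile IgnoreURL
```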


That's all there is to it.  I did this yesterday in about 30 minutes,
including trying to figure it out as I went.  It should only take you 15
minutes to do this.  I hope this helps you.

Cheers 
Dave Patton

