Jin,

   Are you automating your process?  If not, my experience
may be helpful.  Also, you might get ideas from some of the
Analog helper applications at "http://www.analog.cx/helpers/".

   I run Analog 5.32 on Solaris 8.  Apache 2.0.50 runs on
the same machine.  My logformat is vhost_combined.

   I use the Apache logfile splitter to create separate log files
for each virtual host.  I compress the original logfile and save it.
I delete the separate log files after I run Analog.  The process
of automating the logfile splitter is easy and the files that are
produced have predictable names.
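
   To illustrate the splitting step, here is a minimal Python sketch of
the same idea.  It is not the Apache splitter itself.  It assumes that the
first whitespace-delimited field of each line is the virtual host
(optionally followed by ":port"), and the function name and the ".log"
suffix are my own choices for the example.

```python
import re
from pathlib import Path

# Characters allowed in a generated file name; anything else becomes "_".
_UNSAFE = re.compile(r"[^A-Za-z0-9._-]")


def split_by_vhost(lines, out_dir):
    """Append each log line to <out_dir>/<vhost>.log, minus the vhost field.

    Assumption: the first whitespace-delimited field is the virtual host,
    optionally followed by ":port".  Returns a dict of vhost -> line count.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    handles = {}
    counts = {}
    try:
        for line in lines:
            vhost, sep, rest = line.partition(" ")
            if not sep:
                continue  # blank or malformed line: no vhost field
            # Drop any ":port", lowercase, and make the name file-safe.
            name = _UNSAFE.sub("_", vhost.split(":")[0].lower())
            if not name:
                continue
            if name not in handles:
                path = out_dir / f"{name}.log"
                handles[name] = open(path, "a", encoding="utf-8")
                counts[name] = 0
            handles[name].write(rest if rest.endswith("\n") else rest + "\n")
            counts[name] += 1
    finally:
        for fh in handles.values():
            fh.close()
    return counts
```

   Because the output names come straight from the host names, a later
script can find them without guessing.  The sketch keeps one file open
per host, which should be fine for tens of hosts but may run into the
open-file limit with many hundreds.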

   To create Analog configuration files, I use a script that parses
the Apache configuration file.  It identifies each virtual host and
extracts relevant information.  Then the script creates an Analog
configuration file for each virtual host with only basic information:
LOGFILE, HOSTNAME, HOSTURL, OUTFILE, and FILEINCLUDE.
All of the configuration files created by the script exist in a directory
that contains no other files.  I also use a "master" configuration file
that has a large number of directives that are consistent for all of
the virtual hosts.
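
   As a rough sketch of the configuration step (in Python, and not my
actual script), the code below pulls ServerName values out of
<VirtualHost> blocks and writes one small Analog configuration file per
host.  The directory layout, the ".cfg" and ".html" suffixes, the
"FILEINCLUDE *" pattern, and the function names are all assumptions made
for the example.  It also assumes ServerName is written as "host" or
"host:port".

```python
import re
from pathlib import Path

_VHOST_OPEN = re.compile(r"^\s*<VirtualHost\b", re.IGNORECASE)
_VHOST_CLOSE = re.compile(r"^\s*</VirtualHost>", re.IGNORECASE)
_SERVER_NAME = re.compile(r"^\s*ServerName\s+(\S+)", re.IGNORECASE)


def find_vhosts(apache_conf_text):
    """Return ServerName values found inside <VirtualHost> blocks, in order."""
    names = []
    inside = False
    for line in apache_conf_text.splitlines():
        if _VHOST_OPEN.match(line):
            inside = True
        elif _VHOST_CLOSE.match(line):
            inside = False
        elif inside:
            m = _SERVER_NAME.match(line)
            if m:
                # Assumption: "host" or "host:port"; keep only the host.
                name = m.group(1).lower().split(":")[0]
                if name not in names:
                    names.append(name)
    return names


def analog_config(vhost, log_dir, report_dir):
    """Build the text of a per-host Analog configuration file."""
    return "\n".join([
        f"LOGFILE {log_dir}/{vhost}.log",
        f'HOSTNAME "{vhost}"',
        f"HOSTURL http://{vhost}/",
        f"OUTFILE {report_dir}/{vhost}.html",
        "FILEINCLUDE *",  # placeholder pattern; adjust per host as needed
    ]) + "\n"


def write_configs(apache_conf_text, cfg_dir, log_dir, report_dir):
    """Write one <vhost>.cfg per virtual host and return the paths written."""
    cfg_dir = Path(cfg_dir)
    cfg_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for vhost in find_vhosts(apache_conf_text):
        path = cfg_dir / f"{vhost}.cfg"
        path.write_text(analog_config(vhost, log_dir, report_dir), encoding="utf-8")
        written.append(path)
    return written
```

   Everything that is the same for all hosts stays out of these generated
files, so regenerating them after an Apache configuration change is cheap.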

   When I am ready to run Analog, I have a script that looks for
the Analog configuration files, then calls Analog to run on each of
the files while also reading in the master configuration file.  So, the
output is produced virtual host by virtual host.  I repeat this process
once a week for 80 hosts.  I think that a similar process, even with
a different operating system and server software, will work for you.
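
   The run step can be sketched in Python as follows.  This is an
illustration, not my actual script.  It assumes the per-host files end in
".cfg" and that Analog's "+g" option (read an additional configuration
file) and "-G" option (skip the default configuration file) behave as I
remember them from the Analog documentation, so check both against your
version.  If your master file is Analog's default configuration file, you
would leave out "-G" and the master "+g" argument.

```python
import subprocess
from pathlib import Path


def analog_command(master_cfg, host_cfg, analog_bin="analog"):
    """Build the command line for one virtual host.

    Assumption: -G skips the default configuration file and
    +g<file> reads <file> as a configuration file.
    """
    return [analog_bin, "-G", f"+g{master_cfg}", f"+g{host_cfg}"]


def run_all(cfg_dir, master_cfg, analog_bin="analog", runner=subprocess.run):
    """Run Analog once per *.cfg file; return names of configs that failed."""
    failures = []
    for host_cfg in sorted(Path(cfg_dir).glob("*.cfg")):
        cmd = analog_command(master_cfg, host_cfg, analog_bin)
        result = runner(cmd)
        if result.returncode != 0:
            failures.append(host_cfg.name)
    return failures
```

   Collecting the failures rather than stopping at the first one means
that one bad host does not block the reports for the others.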

   Before I automated anything, I worked with Analog for a long time
to learn all of the commands that are useful for my environment.  This
way, when I started scripting, it was easy to determine if a problem
was caused by poor scripting or by poor use of Analog's commands.
A lot of work was needed to establish the process, but it is reliable
50 times out of 52.  If I tweak my scripts, I might get 100% reliability.

I hope this helps,

-- Duke


Jin Zhao wrote:

This seems like a dumb question, but I think it would be valuable if Analog could do it.

In our site setup, all virtual hosts log to one big access_log. This log grows fast (2 million lines per day) and gets rotated and compressed nightly. The boss wants to know how many "visitors" visited his sites. Everybody knows this is something stupid, but I have to give him some numbers; distinct hosts, say, might be good enough.

The problem is that, for Analog to report distinct hosts for each virtual host, I have to split these huge rotated and compressed log files into hundreds of smaller vhost-based files. Worse, I have to create an Analog configuration for each vhost and run Analog and Report Magic against these hundreds of smaller log files, hundreds of times.

Did I miss something useful in the current Analog features? Can somebody suggest a better solution?

Thanks,


Jin

begin:vcard
fn:Duke Hillard
n:Hillard;Duke
org:University of Louisiana at Lafayette;University Computing Support Services
adr:;;P.O. Box 42770;Lafayette;LA;70504-2770;USA
email;internet:[EMAIL PROTECTED]
title:University Webmaster
tel;work:337.482.5763
url:http://www.louisiana.edu/
version:2.1
end:vcard
