I've made the Python script that I've been using for log analysis publicly available:
http://zak.ecotroph.net/~anewton/cla/


The README file is attached so that people may determine their interest.

-andy


Courier Log Analyzer
====================
I did not develop this software to take the place of "grep" for specific
incident troubleshooting.  Its purpose is to allow a "big picture" view
of the workings of a Courier mail system.  By analyzing mail flow, I
have discovered and fixed a couple of misconfigurations in my setup of
Courier.

This program is released under the GNU GPL.  See LICENSE.TXT.

Usage
-----
This is a Python program.  It should work under Python 2.0 or greater.
It outputs wide text reports to stdout.  The typical use to get the
summary reports is like this:

  $ cla.py maillog.*

After viewing the daily summaries, you may want to focus in on a
particular day.  That can be done like this:

  $ cla.py -D apr-4 maillog.*

The program has many options, which may be set in a configuration
file specified with the '-c' option.  The '-h' option prints all the
command line switches and the configuration file format.

Some people might want to turn off reverse-DNS lookups on hosts using
the '-n' option.  Currently, this program does not do any caching or
threading of these lookups, so leaving reverse-DNS lookups on will
slow down the program and cause repeated recursive lookups on your DNS
servers.  A simple solution is to install a local caching server
such as BIND or dnscache.
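
For readers curious what in-process caching of these lookups might look
like, here is a minimal sketch.  It is not code from cla.py: the
function name, the module-level cache, and the injectable "resolver"
parameter are all assumptions made for illustration, and it is written
for a current Python 3 rather than Python 2.0.

```python
import socket

# Hypothetical cache, keyed by IP address string.  Not part of cla.py.
_cache = {}

def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the host name for ip, or ip itself if the lookup fails.

    Each address is resolved at most once; later calls reuse the
    cached answer, including cached failures.
    """
    if ip not in _cache:
        try:
            # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
            _cache[ip] = resolver(ip)[0]
        except OSError:
            # socket.herror and socket.gaierror are OSError subclasses.
            _cache[ip] = ip
    return _cache[ip]
```

With a cache like this, a host that connects a thousand times in one
log costs one DNS query instead of a thousand.  A local caching server
is still the simpler fix, since it needs no code changes at all.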

Performance
-----------
I have not yet focused on performance.  However, I did run this
analyzer over a 527M log file to see what would happen.  On an iMac
with a 1GHz PowerPC G4, it took 14m20s to complete (using the -n
switch to turn off reverse lookups).  Memory use for that run peaked
at 54.6M virtual and 41.9M real.

Reports
-------
All of the reports can be enabled or disabled.  By default, the daily
detailed reports are disabled.  In addition, the thresholds used by the
summary reports can be set in the configuration file.  Larger sites
will probably want to set them higher than the defaults.

SMTP Summary by day:
o connections
o local deliveries
o total delivery size
o # of local errors
o SMTP relays
o total relay size
o SMTP errors
o broken pipes
o 5XX errors
o freemail errors

SMTP Statistical Summary by day:
o same fields as above, each given as its difference from the median
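
As an illustration of what a "difference from median" column could
mean, the sketch below subtracts the median of a set of daily counts
from each day's count.  This is an assumption about the calculation,
not a description of how cla.py computes it; the function name and the
sample figures are made up.

```python
from statistics import median

def median_differences(daily_counts):
    """Map each day to (its count - median of all days' counts)."""
    mid = median(daily_counts.values())
    return {day: count - mid for day, count in daily_counts.items()}

# Made-up connection counts for three days.
connections = {"apr-3": 100, "apr-4": 140, "apr-5": 90}
print(median_differences(connections))
```

Read this way, a large positive or negative value marks a day that
stands out from the typical one, which is the kind of "big picture"
signal the summary reports are meant to surface.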

Delivery by domain:
o local deliveries
o median difference of local deliveries
o total delivery size
o median difference of delivery size
o freemail count

Relays by domain:
o SMTP relays
o median difference of relays
o total relay size
o median difference of size
o in-bound relays
o out-bound relays
o freemail count

Hosts:
o SMTP connections
o median difference of connections
o SMTP error count
o median difference of error count
o error ratio (errors-to-connections)
o 5XX error count
o median difference of 5XX count
o 5XX error ratio (5XX errors-to-connections)
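
The two ratio columns above are simple quotients.  A minimal sketch,
assuming the ratio is errors divided by connections and that a host
with no connections is reported as 0.0 (both the function name and the
zero-connection handling are assumptions, not taken from cla.py):

```python
def error_ratio(errors, connections):
    """Return errors per connection, or 0.0 when there were none."""
    if connections == 0:
        return 0.0
    return errors / connections

# Made-up figures: 5 errors over 20 connections.
print(error_ratio(5, 20))
```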

IMAP Summary by day:
o IMAP connections
o median difference of connections
o IMAP users
o median difference of users
o IMAP logins
o median difference of logins

Detailed Daily:
o IMAP connections
 - IP Address
 - # of connections
o IMAP logins
 - sorted by user
 - IP Address
 - # of times per address
o SMTP connections
 - IP Address
 - # of connections
o SMTP local deliveries
 - e-mail address
 - # of messages by address
 - total size by address
o SMTP local errors
o SMTP relays
 - e-mail address
 - total count by address
 - total size by address
o SMTP errors
o SMTP broken pipes

Forgiveness, Comments, & Bugs
-----------------------------
Much of the code for this analyzer is raw, uncommented, and in need of
better form.  However, this program also served as my get-to-know-Python
learning tool.

If you have comments or find bugs, please send them to:
  [EMAIL PROTECTED]
