Emma,
It is just as easy to use a script to slice and dice the data afterwards.
I have attached an example script, one we use on our own logs, that will
perform the expansion you want. The only caveat is that if your DNS is
dynamic, resolving the addresses after the fact is not accurate. Let me know
if this is any help.

Any errors in this script, while mine, become the problem of whoever uses
it. It is provided with no guarantee.
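
To work out which value to pass to -f, it helps to split a sample line the
same way the script does (on whitespace). What follows is only a rough sketch
and is not part of the attached script: the helper name ip_field_numbers is
made up for illustration, and the sample line is copied from the syslog
excerpt in the quoted message.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of ResolveIPinLogs.pl): return the 1-based
# positions of the whitespace-separated fields that look like dotted-quad
# IPv4 addresses, i.e. candidate values for the -f flag.
sub ip_field_numbers {
    my ($line) = @_;
    my @fields = split ' ', $line;    # same splitting rule as a bare split
    my @hits;
    for my $i (0 .. $#fields) {
        # Same shape of check as the script: three "digits." groups, then digits
        push @hits, $i + 1 if $fields[$i] =~ /^(?:\d{1,3}\.){3}\d{1,3}$/;
    }
    return @hits;
}

# Sample line taken from the syslog excerpt in the quoted message
my $sample = 'Mar  4 14:53:36 medusa htsearch[31973]: 10.0.0.1 [en] (and) [humna rights]';
print join(',', ip_field_numbers($sample)), "\n";
```

For that sample it prints 6, so the syslog files would be run through the
script with -f 6. In the Apache log the address is the first field, so -f 1
would be the value there.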

                        Bill Knox
                        Senior Operating Systems Programmer/Analyst
                        The MITRE Corporation

On Tue, 4 Mar 2003, Emma Jane Hogbin wrote:

> Date: Tue, 4 Mar 2003 15:10:52 -0500
> From: Emma Jane Hogbin <[EMAIL PROTECTED]>
> To: "ht://dig" <[EMAIL PROTECTED]>
> Subject: [htdig] IP addresses in logs
>
> When I turn htsearch logging on all I'm capturing is the IP of the web
> server. Is there a way to capture the IP of the person who entered the
> search term into the web server (at the ht://dig end).
>
> Examples from the logs on the search box:
>
> syslog:
>
> Mar  4 14:53:36 medusa htsearch[31973]: 10.0.0.1 [en] (and) [humna rights]
> [humna and rights] (0/15) - 1 --
> http://www.foreign-policy-dialogue.ca/en/answers/index.php
> Mar  4 14:53:59 medusa htsearch[31974]: 10.0.0.1 [en] (and) [human rights]
> [human and rights] (95/15) - 1 --
> http://www.foreign-policy-dialogue.ca/en/answers/index.php
>
>
> apache server log:
>
> 10.0.0.1 - - [02/Mar/2003:06:59:38 -0500] "GET
> /cgi-bin/htsearch?config=en&restrict=&words=policies HTTP/1.0" 200 29549
> "http://search.foreign-policy-dialogue.ca/cgi-bin/htsearch?config=en&SESSION=f5e64a5fb56e9f0a5aa39a554eeb15d2&restrict=&words=returning+residents"
> "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
> 10.0.0.1 - - [02/Mar/2003:12:40:50 -0500] "GET
> /cgi-bin/htsearch?config=en&SESSION=3906375feb5fbaeacc9bfad87862e931&restrict=&words=staff
> HTTP/1.0" 200 25771
> "http://www.foreign-policy-dialogue.ca/en/sitemap/index.html" "Mozilla/4.0
> (compatible; MSIE 6.0; Windows NT 5.0)"
>
>
>
> --
> Emma Jane Hogbin
> [[ 416 417 2868 ][ www.xtrinsic.com ]]
>
>
#!/usr/local/bin/perl -w
# $Id: ResolveIPinLogs.pl,v 1.4 2000/01/04 16:51:04 wknox Exp $
#
# $Log: ResolveIPinLogs.pl,v $
# Revision 1.4  2000/01/04 16:51:04  wknox
# Added a flag (-c) to perform a count of unique IP addresses, printed to
# standard out, instead of resolving addresses.
#
# Revision 1.3  1999/11/10 21:49:56  wknox
# Refined the regular expression which determines if the part of the line
# to look up is really an IP address
#
# Revision 1.2  1999/11/04 18:21:32  wknox
# Added capability to resolve multiple fields on each line.
#
# Revision 1.1  1999/11/04 00:09:26  wknox
# Initial revision
#

use strict;
use Socket;
use File::Basename;
use Getopt::Std;
umask 002;
use vars qw($opt_f $opt_o $opt_g $opt_d $opt_c);

my %dns_names = ();
my @fields_to_check;
my $new_file_dir = "resolved";
my $owner_uid = $<;
my $group_gid = $(;

# Check for the use of the flags -f, -g, -o, -d, and -c
# The -f flag indicates which field in the log contains the IP address, and 
# is required. The -o and -g flags indicate the ownership and group of the
# output files, respectively, and, if not provided, default to the real uid
# and gid of the owner of the process. The -d flag allows for an alternate
# directory other than "resolved" to be indicated for the resolved files.
# The -c flag allows for only a count of unique IP addresses without resolving
# them.
getopts('f:o:g:d:c');

# Accept only a comma-separated list of field numbers, e.g. "1" or "1,6"
if (defined $opt_f and $opt_f =~ /^\d+(?:,\d+)*$/) {
        @fields_to_check = split /,/, $opt_f;
        foreach (@fields_to_check) {
                --$_;
        }
}
else {
        print "usage: ResolveIPinLogs.pl -f (field number,field number,...) [-c] "
            . "[-d directory] [-o owner] [-g group] logfile1 logfile2...\n";
        exit 1;
}

if (defined $opt_o) {
        $owner_uid = (getpwnam($opt_o))[2];
        unless (defined $owner_uid) {
                print "Invalid username $opt_o\n";
                exit 1;
        }
}

if (defined $opt_g) {
        $group_gid = (getgrnam($opt_g))[2];
        unless (defined $group_gid) {
                print "Invalid group $opt_g\n";
                exit 1;
        }
}

if (defined $opt_c) {
        $new_file_dir = "IPcount";
}

if (defined $opt_d) {
        $new_file_dir = $opt_d;
}

foreach my $log_file (@ARGV) {
        # Keep the customer informed of your progress
        if (! defined $opt_c) {
                print "Resolving $log_file\n";
        }
        else {
                print "Counting IP addresses in $log_file\n";
        }

        # Set the name of the file to be created and where to put it
        my ($output_file, $current_dir, $gzipped) = fileparse ($log_file, ".gz");
        my $output_dir;
        if ($new_file_dir =~ /^\//) {
                $output_dir = $new_file_dir;
        }
        else {
                $output_dir = "$current_dir$new_file_dir";
        }
        $output_file = "$output_dir/$output_file";

        # Open the original log file for reading (autodetecting if it was gzipped
        # or not as well, based on presence of gz suffix)
        if ($gzipped) {
                if (-r $log_file) {
                        open LOGFILE, "/usr/local/bin/zcat $log_file |"
                                or die "Unable to open $log_file for reading: $!\n";
                }
                else {
                        print "File $log_file unreadable\n";
                        exit 1;
                }
        }
        else {
                open LOGFILE, $log_file
                        or die "Unable to open $log_file for reading: $!\n";
        }

        # Check to see if output directory exists - if not, create it and give it
        # the proper permissions
        unless (-d $output_dir or defined $opt_c) {
                print "Making directory $output_dir\n";
                mkdir $output_dir, 00775
                        or die "Unable to make directory $output_dir: $!\n";
                chown ($owner_uid, $group_gid, $output_dir)
                        or warn "Unable to change ownership for $output_dir: $!\n";
        }

        # Open the file to which the resolved log will get written
        unless (defined $opt_c) {
                open RESOLVED_LOG, "> $output_file"
                        or die "Unable to open $output_file for writing: $!\n";
                print "Writing results to $output_file\n";
        }

        while (<LOGFILE>) {
                chomp;
                my @line = split;
                foreach my $field_to_check (@fields_to_check) {
                        my $address = $line[$field_to_check];

                        # Check to make sure that the line has an IP address in that field
                        # If not, just write it out straight
                        # Regexp is 3 instances of 1-3 digits followed by a period,
                        # followed by 1-3 digits
                        unless (defined $address and $address =~ /^(\d{1,3}\.){3}\d{1,3}$/) {
                                next;
                        }

                        # Check to see if we already have seen it...
                        if (defined $dns_names{$address}) {
                                $line[$field_to_check] = $dns_names{$address};
                        }
                        # ...and check to see if we are only counting unique IPs...
                        elsif (defined $opt_c) {
                                $dns_names{$address} = "";
                        }
                        # ...and if not, look it up
                        else {
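                                my $packed_ip = inet_aton($address);
                                # inet_aton returns undef for strings that pass the regexp
                                # but are not valid addresses (e.g. 999.1.1.1)
                                my $hostname = defined $packed_ip
                                        ? gethostbyaddr($packed_ip, AF_INET)
                                        : undef;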
                                # If we got a hostname back, set the hash with it
                                if (defined $hostname) {
                                        $dns_names{$address} = $hostname;
                                        $line[$field_to_check] = $dns_names{$address};
                                }
                                # If no hostname came back, set the hash with the IP address to
                                # prevent further lookups
                                else {
                                        $dns_names{$address} = $address;
                                }
                        }
                }

                # Shove the line back together and print it to the output file
                unless (defined $opt_c) {
                        my $new_line = join ' ', @line;
                        print RESOLVED_LOG "$new_line\n";
                }
        }
        close LOGFILE;
        unless (defined $opt_c) {
                close RESOLVED_LOG;
        }

        unless (defined $opt_c) {
                # If the original file was gzipped, gzip the output file and set the
                # proper ownership
                if ($gzipped) {
                        print "GNU zipping $output_file\n";
                        system ("/usr/local/bin/gzip $output_file");
                        chown ($owner_uid, $group_gid, "$output_file.gz")
                                or warn "Unable to change ownership for $output_file.gz: $!\n";
                }
                # Otherwise, just set the ownership
                else {
                        chown ($owner_uid, $group_gid, $output_file)
                                or warn "Unable to change ownership for $output_file: $!\n";
                }
        }
}

if (defined $opt_c) {
        print "Total unique IP addresses: ", scalar keys %dns_names, "\n";
}

exit 0;
