From what I see you may want more than one table: one that stores your
raw log information ("log date", "logged by", "log statement", etc.),
and another that summarizes/counts how many times you find whatever it
is you wish to count.  For instance, if it is a particular error
message you want to count, you don't want to include the "log date" and
"logged by" columns in the summary.  If "log date" measures time to the
microsecond, most likely all your counts will equal 1.  I don't really
know what your logging purpose is, though.
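To sketch what I mean, here is a minimal example of summarizing a raw
log table into a counts table with GROUP BY.  The table and column
names ("log", "message", "message_counts") are illustrative, not from
your schema, and I use an in-memory SQLite database just for the demo:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect("DBI:SQLite:dbname=:memory:", "", "",
                       { RaiseError => 1 });

# Raw log table plus a separate summary table.
$dbh->do("CREATE TABLE log (log_date TEXT, logged_by TEXT, message TEXT)");
$dbh->do("CREATE TABLE message_counts (message TEXT PRIMARY KEY, n INT)");

# Pretend some logging already happened; note the microsecond dates
# differ even though the message is the same.
$dbh->do("INSERT INTO log VALUES (?, ?, ?)", undef, @$_)
    for (["2007-03-30 01:20:00.000001", "george", "disk full"],
         ["2007-03-30 01:20:00.000002", "george", "disk full"],
         ["2007-03-30 01:20:00.000003", "george", "link down"]);

# Summarize: count each distinct message, deliberately ignoring the
# date and author columns that would make every count 1.
$dbh->do("INSERT INTO message_counts
          SELECT message, COUNT(*) FROM log GROUP BY message");

my ($n) = $dbh->selectrow_array(
    "SELECT n FROM message_counts WHERE message = ?", undef, "disk full");
print "disk full seen $n times\n";   # prints "disk full seen 2 times"
```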

Have you looked at using Log::Log4perl?  Log4perl can log directly to a
database via its DBI appender.  Then you could write a program to
summarize your logs into another table.  I would look at that.
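As a rough sketch, a Log4perl configuration using
Log::Log4perl::Appender::DBI might look like the following (the table
name "log" and its columns are my assumption; adjust to your schema):

```
log4perl.logger                            = INFO, DBAppndr
log4perl.appender.DBAppndr                 = Log::Log4perl::Appender::DBI
log4perl.appender.DBAppndr.datasource      = DBI:SQLite:dbname=mydb.sqldb
log4perl.appender.DBAppndr.sql             = INSERT INTO log (log_date, loglevel, message) VALUES (?, ?, ?)
log4perl.appender.DBAppndr.params.1        = %d
log4perl.appender.DBAppndr.params.2        = %p
log4perl.appender.DBAppndr.usePreparedStmt = 1
log4perl.appender.DBAppndr.layout          = Log::Log4perl::Layout::NoopLayout
log4perl.appender.DBAppndr.warp_message    = 0
```

The log message itself fills the remaining placeholder; see the
Log::Log4perl::Appender::DBI documentation for the details.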

-----Original Message-----
From: George Bills [mailto:[EMAIL PROTECTED] 
Sent: Friday, March 30, 2007 1:20 AM
To: [email protected]
Subject: Updating counts

Hi all;

Sorry if I'm asking a silly question - I'm very inexperienced with
databases. I'm playing around with using DBI / SQLite as a back-end for
parsing and storing log files - I have multiple columns like "log date",
"logged by", etc. I parse these in a Perl script, and then store them in
the SQLite database. I would like to increment a "count" column if I
parse the same log line twice: on the first time I see that log line, I
parse it and store it in its own row, and on subsequent occasions I
increment a "count" column to say that I have seen that log line "count"
amount of times. Some quick Googling has shown some stuff to do with
auto increments, but that seemed to be much more to do with primary keys
and wasn't guaranteed to be in order / contiguous. I've also seen MySQL
syntax like "ON DUPLICATE KEY UPDATE ..." which looks good, but I'd like
to avoid an extra dependency on a database that has to be set up and
maintained (hence SQLite).

My preference for a solution would be something that's simple and
maintainable primarily, followed by fast (for reads first, writes
second) - but I don't have the database experience to know what the best
solution is. Is there an easy way? Should I be storing duplicate rows
and counting them at display time?



Using hashes, this works something like this (but I have more data than
I can safely store in memory with a hash, and sorting and merging
subhashes on top of this makes things tricky):

my %lineCount;
while (<>) {
    chomp();
    $lineCount{$_}++;
}



Using DBI, the best I've been able to get has been this:
my $table = "lines";
my $dbh = DBI->connect("DBI:SQLite:dbname=mydb.sqldb", "", "",
                       { RaiseError => 1 });
$dbh->do("CREATE TABLE $table (line  CHAR(100) PRIMARY KEY,
                               count INT)");
# read data
while (<>) {
    chomp();
    my $rowRef =
        $dbh->selectall_arrayref("SELECT * FROM $table WHERE line = ?",
                                 undef, $_);
    unless (@$rowRef) {
        # first time we've seen this line: insert it with count 0
        $dbh->do("INSERT INTO $table VALUES(?, ?)", undef, $_, 0);
    }
    # count becomes 1 on the first sighting, and increments after that
    $dbh->do("UPDATE $table SET count = count + 1 WHERE line = ?",
             undef, $_);
}
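(One pattern I've seen that would avoid the SELECT entirely is SQLite's
INSERT OR IGNORE: the insert silently does nothing when the primary key
already exists, so every line needs only two prepared statements.  A
sketch, using an in-memory database and a hard-coded list of lines in
place of <>:)

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect("DBI:SQLite:dbname=:memory:", "", "",
                       { RaiseError => 1 });
my $table = "lines";    # illustrative name
$dbh->do("CREATE TABLE $table (line  CHAR(100) PRIMARY KEY,
                               count INT)");

# Prepare once, execute per line: INSERT OR IGNORE creates the row
# with count 0 the first time only; the UPDATE then bumps the count
# on every sighting, so the first sighting ends at 1.
my $ins = $dbh->prepare("INSERT OR IGNORE INTO $table VALUES (?, 0)");
my $upd = $dbh->prepare(
    "UPDATE $table SET count = count + 1 WHERE line = ?");

for my $line ("foo", "bar", "foo") {
    $ins->execute($line);
    $upd->execute($line);
}
```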



Thanks in advance for any help.
