Toddy Prawiraharjo wrote: > > From: Rob Dixon [mailto:[EMAIL PROTECTED] > Sent: Friday, 29 September 2006 10:04 AM > To: beginners@perl.org > Subject: Re: Querying very big database >> >> Toddy Prawiraharjo wrote: >> >>> Hello all, >>> >>> I am not sure if i'm inside the correct mailing list, but it should be >>> simple for you guys. Recently i started a simple perl program to catch all >>> syslog from my netscreen firewall, and put them into a database, so later >>> can do some analysing on it (sort of AWStats). the database itself is pretty >>> big, with 600k entry (for 3 weeks of the running firewall) worth 80 megs in >>> mysql. My question is, during the analysing my perl script giving very slow >>> processing. I know the query to mysql itself is pretty quick, less than 2 >>> seconds to return about 40k to 60k result to perl to be analysed >>> >>> Here it goes: >>> <snipped> >>> $query = "SELECT src, rcvd FROM no_name WHERE start_date >= >>> '$fromdate' AND start_date <= '$todate';"; >>> $sth = $dbh->prepare($query); >>> $sth -> execute() || &errorhere("SQL Error: $DBI::errstr"); >>> print "$query<br>"; >>> >>> my %src = (); >>> my ($totalsent, $totalrcvd, $srcchecked); >>> >>> while(@row = $sth->fetchrow_array){ >>> $srcchecked = 0; >>> while($source, $total = each (%src)){ >>> if ($source eq "$row[0]"){ >>> $srcchecked = 1; >>> $src{$source} = $src{$source} + $row[1]; >>> } >>> } >>> if ($srcchecked != 1){ >>> print "$row[0]<br>"; >>> $src{$row[0]} = $row[1]; >>> } >>> $totalrcvd = $totalsent + $row[1]; >>> #$totalrcvd = $totalrcvd + $row[2]; >>> } >>> </snipped> >>> >>> The while loop to do analysing on the data take more than 15 minutes, and >>> that only to a query for 1 day long records($fromdate-$todate) So, if I >>> want perl to give me faster result, what's the solution? Did I make >>> fundamentally wrong approach? >> >> >> You've missed the whole point of hashes, which is that they will access >> a data >> value directly from a key. What you've written is similar to >> >> my $i = 0; >> my $value; >> foreach my $v (@array) { >> if ($i++ == $n) { >> $value = $v; >> } >> } >> >> instead of >> >> $value = $array[$n]; >> >> Try this while loop instead of the one you have and see if you get an >> improvement: >> >> while (@row = $sth->fetchrow_array) { >> >> my $key = $row[0]; >> >> print "$key<br>" unless exists $src{$key}; >> $src{$key} += $row[1]; >> >> $totalsent += $row[1]; >> #$totalrcvd += $row[2]; >> } >> >> HTH, >> >> Rob >> > > THX! I always knew it's my n00b scripting that caused the problem. It > now down to 4 lines inside while loop, and I'm flying! The processing > down from ~15 minutes to 4 secs! But, for longer date range (2 weeks > time limit) it clocked at 25 wallclock secs (with processing ~350k mysql > return entries). Any more way to process this beast any faster? Any good > reading or reference about this? Just afraid, if this rolled to > production level, the report for months activities will take minutes to > make! > > <snipped> > $query = "SELECT src, sent, rcvd FROM no_name WHERE start_date > >= '$fromdate' AND start_date <= '$todate';"; > > $sth = $dbh->prepare($query); > $sth -> execute() || &errorhere("SQL Error: $DBI::errstr"); > #print "$query<br>"; > > my %src = (); > my ($totalsent, $totalrcvd); > > while(@row = $sth->fetchrow_array){ > my $key = $row[0]; > $src{$key} += $row[2]; > $totalsent = $totalsent + $row[1]; > $totalrcvd = $totalrcvd + $row[2]; > } > </snipped>
Hi Toddy (Please bottom-post your replies. One day everybody in the world will know to do this and perl.beginners will be a happy, shiny place!) Let the database engine do it for you! my $query = qq( SELECT src, SUM(sent) AS totalsent, SUM(rcvd) AS totalrcvd FROM no_name WHERE start_date >= '$fromdate' AND start_date <= '$todate' GROUP BY src ); my $sth = $dbh->prepare($query); $sth->execute() || errorhere("SQL Error: $DBI::errstr"); my $src = $sth->fetchall_hashref('src'); foreach my $srcval(keys %$src) { $totalsent += $src->{$srcval}{totalsent}; $totalrcvd += $src->{$srcval}{totalrcvd}; } (I couldn't easily test this, although it is syntax checked, so beware) Now, $src is a reference to a hash similar to your original %src, but has another hash reference as its values, so you can extract my $totalsent = $src{src1}{totalsent}; and so on. HTH, Rob -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] <http://learn.perl.org/> <http://learn.perl.org/first-response>