I've run into similar problems; here's what we've done.

One of our sites generates about a gigabyte of logfiles per day; here is
the solution we've used.

First off, the logfiles are stripped of unnecessary data, i.e. gifs, jpgs,
internal traffic, etc.  This shrunk the logfiles to 25% of their original
size.  Even though we already had exclusion directives in the analog.cfg
file for these things, it's amazing how much faster this made processing,
as well as how much less RAM we now use.
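As a rough sketch of that first step (assuming Common Log Format; the internal address prefixes and extensions below are made-up placeholders, so substitute whatever applies to your site):

```python
# Strip image requests and internal traffic from Common Log Format files
# before feeding them to analog.  The prefixes/extensions here are
# illustrative assumptions, not the ones we actually use.
INTERNAL_PREFIXES = ("10.", "192.168.")   # assumed internal address ranges
SKIP_EXTENSIONS = (".gif", ".jpg")        # image hits, not page views

def keep(line):
    """Return True if a CLF line should survive the stripping pass."""
    parts = line.split()
    if len(parts) < 7:
        return False                      # malformed line, drop it
    host, path = parts[0], parts[6]
    if host.startswith(INTERNAL_PREFIXES):
        return False                      # internal traffic
    if path.lower().endswith(SKIP_EXTENSIONS):
        return False                      # image request
    return True

def strip_log(lines):
    """Filter an iterable of log lines down to the ones worth keeping."""
    return [line for line in lines if keep(line)]
```

Run it once per logfile and write the surviving lines out to a new file; everything downstream then only ever sees the smaller file.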

The next thing we did was to turn off analog's DNS lookups and instead
rewrite the files with the IPs converted to hostnames.  We do this because
multiple processes actually use these logfiles, and rather than have each
of those processes resolve hostnames itself, I found it much faster to
convert each logfile once and never have to resolve IPs again for that
file.

These two changes have tremendously reduced the resources required.  I now
process one week's logfiles, which is approximately 1.5 million page views
(not hits), in about 10 minutes on an Origin 200 with 1 GB of RAM.

--shak

 Shakeel Sorathia
    Unix Team
[EMAIL PROTECTED]
  626-660-3502

> -----Original Message-----
> From: Emmett Hogan [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, December 23, 1999 9:15 AM
> To: Analog Help List
> Subject: [analog-help] Trouble Processing *REALLY* Large Logfiles
> 
> 
> Hello,
> 
> I seem to be having a problem processing very large log files.  I am
> using DNStran to preload the dns cache file, and that sped things up
> immensely.  But now, analog just seems to go to "sleep" and do nothing
> after running for a while.
> 
> Here are the particulars:
> 
> Machine: 6 Processor Sun E450, Solaris 2.6, 2 GBytes memory
> Log files: 5 logfiles produced daily, each approx 200M
> 
> The problem seems to be connected to memory, notice the size of the
> analog process:
> 
>   PID USERNAME  PRI NICE   SIZE   RES STATE   TIME   WCPU    CPU COMMAND
> 26135 root       1   60   1916M 1181M sleep  50:36  0.72%    analog
> 
> Needless to say, when I am processing a month's worth of logs that's a
> BOATLOAD of data.  The process above was only processing 1 week of data.
> 
> I realize that I could create an analog cache and use that, but I
> would lose some data.  
> 
> My question is, would it solve this problem?
> 
> Also, if I went to using a cache, would I:
> 
>       1) Backup the existing cache.
>       2) Process the latest log files plus the cache
>       3) Move the latest log files out of the way
>       
> Each day?
> 
> This is killing me because I pushed for using analog over wusage
> because of the speed of Analog. :-(
> 
> TIA,
> -Emmett
> 
> 
> -- 
> Emmett Hogan, GNAC
> ------------------------------------
> What's going on, Normie?
> My birthday, Sammy. Give me a beer, stick a candle in it, and 
> I'll blow out my liver.
> ------------------------------------
> ------------------------------------------------------------------------
> This is the analog-help mailing list. To unsubscribe from this
> mailing list, send mail to [EMAIL PROTECTED]
> with "unsubscribe analog-help" in the main BODY OF THE MESSAGE.
> List archived at http://www.mail-archive.com/[email protected]/
