Mick Burrell <[EMAIL PROTECTED]> wrote:
> On 1 May 2007, at 22:07, Aengus wrote:
>
>> Mick Burrell <[EMAIL PROTECTED]> wrote:
>>> I have a site on a server which produces log files in what seems to
>>> be plain text (type .log) and compressed (type .gz). Analog handles
>>> these just fine but the server only keeps them for three weeks. I'd
>>> like to be able to download these on a weekly basis and merge the
>>> files to produce a monthly (or longer period) report. I realise I
>>> can't just duplicate the files or I'd get repeat entries.
>>
>> You don't need to merge the logfiles - Analog can read multiple
>> logfiles.
>
> So if I download a file containing data for weeks 1, 2 and 3 and the
> next week download the file which by then will contain weeks 2, 3 and
> 4, do you mean that Analog will not count double for weeks 2 and 3?

No. If you're downloading overlapping data, then you'd have to remove 
the overlap yourself. Analog will assume that overlapped data is coming 
from different web servers, and will not "filter it out". I'm not sure 
why you would need to do that though - don't you have access to "closed" 
logfiles? Most web servers "rotate" their logs either on a fixed 
schedule (hourly, daily, weekly or monthly) or when the logfile reaches a 
fixed size. While you might want to include the "live" logfile in your 
analysis (especially if the rotation schedule is fairly long), you 
shouldn't archive these "live" logs, just the ones that have been 
rotated.
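
If you do end up with overlapping downloads, here is a rough sketch of how 
the overlap could be stripped before running Analog. It is only an 
illustration, not anything Analog provides: the function name 
remove_overlap is made up, and it assumes that overlapping entries are 
byte-for-byte identical in both downloads. Lines are counted rather than 
simply de-duplicated, so that genuinely repeated requests (same client, 
same second) within one file are not thrown away.

```python
from collections import Counter


def remove_overlap(older_lines, newer_lines):
    """Return the lines of newer_lines that are not already in older_lines.

    Hypothetical helper: assumes an overlapping entry is identical,
    character for character, in both downloads.
    """
    remaining = Counter(older_lines)  # how many copies of each line were already seen
    fresh = []
    for line in newer_lines:
        if remaining[line] > 0:
            # This copy is already covered by the older download; skip it.
            remaining[line] -= 1
        else:
            fresh.append(line)
    return fresh
```

You would read both files into lists of lines, pass the earlier download 
as older_lines, and write the result out as the log to analyse for the 
newer period.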

From your initial description, I imagine that the .gz files are the 
"rotated" logs - they are "done" and don't overlap. You could consider 
the .log file as "temporary": it is constantly being updated until the 
end of the day, when it is gzipped and a new .log is created. In that 
situation, you only want to archive the .gz files.
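
Since the server keeps three weeks of logs, a weekly download will fetch 
many of the same .gz files again. A minimal sketch of one way to archive 
only the ones you don't already have is below. The function name and the 
two directory arguments are hypothetical, and it assumes that a rotated 
.gz file never changes once it exists, so matching on the file name is 
enough.

```python
import shutil
from pathlib import Path


def archive_rotated(download_dir, archive_dir):
    """Copy rotated (.gz) logs that are not yet archived; return the names copied.

    Hypothetical helper: assumes a rotated file keeps the same name and
    contents for as long as the server retains it.
    """
    download_dir, archive_dir = Path(download_dir), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(download_dir.glob("*.gz")):  # the "live" .log files are ignored
        dest = archive_dir / src.name
        if not dest.exists():
            shutil.copy2(src, dest)
            copied.append(src.name)
    return copied
```

Running it after each weekly download should leave the archive directory 
with exactly one copy of every rotated log.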

That's a fairly common scenario for managing web server logs - I'd be 
surprised if your server is using .log and .gz files any differently.

>> The simplest solution is add the "historical" files to a zip file,
>> and just run Analog against this zip file.
>
> The files are small at present but I'm not sure I understand what you
> mean by this. Sorry - I'm no doubt being dim.

If you have daily log files, it's often handy to stick a month's worth of 
logs into a zip file, so that you have one file of 20MB, instead of 30 
files of 10-15MB each. If you want to archive your log files on a 
network drive, Analog will run much faster reading a 20MB .zip file over 
the network than 400MB of plain text logfiles. (Log files will often 
achieve 20:1 compression.)
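
Any zip tool will do for this. As an illustration only, here is a small 
sketch using Python's standard zipfile module. The function name zip_logs 
is made up, and it assumes the logs being added are plain text files with 
distinct names; files already in the archive are skipped, so it can be 
re-run as new daily logs arrive.

```python
import zipfile
from pathlib import Path


def zip_logs(log_paths, zip_path):
    """Add each log file to zip_path (created if missing); return the member names.

    Hypothetical helper: members are stored under their bare file name,
    so the input files are assumed to have distinct names.
    """
    with zipfile.ZipFile(zip_path, "a", compression=zipfile.ZIP_DEFLATED) as archive:
        existing = set(archive.namelist())
        for path in map(Path, log_paths):
            if path.name not in existing:  # skip logs that are already archived
                archive.write(path, arcname=path.name)
                existing.add(path.name)
        return archive.namelist()
```

The compression ratio you actually get will depend on the logs 
themselves.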

Even if the size of the logs isn't an issue, if the only reason you're 
keeping them is to run Analog against them, then collecting them in zip 
files can be a tidy way to maintain archived logs.

Aengus



