Mick Burrell <[EMAIL PROTECTED]> wrote:
> On 1 May 2007, at 22:07, Aengus wrote:
>
>> Mick Burrell <[EMAIL PROTECTED]> wrote:
>>> I have a site on a server which produces log files in what seems to
>>> be plain text (type .log) and compressed (type .gz). Analog handles
>>> these just fine but the server only keeps them for three weeks. I'd
>>> like to be able to download these on a weekly basis and merge the
>>> files to produce a monthly (or longer period) report. I realise I
>>> can't just duplicate the files or I'd get repeat entries.
>>
>> You don't need to merge the logfiles - Analog can read multiple
>> logfiles.
>
> So if I download a file containing data for weeks 1, 2 and 3 and the
> next week download the file which by then will contain weeks 2, 3 and
> 4, do you mean that Analog will not count double for weeks 2 and 3?

No. If you're downloading overlapping data, then you'd have to remove
the overlap yourself. Analog will assume that overlapped data is coming
from different web servers, and will not "filter it out".

I'm not sure why you would need to do that though - don't you have
access to "closed" logfiles? Most web servers "rotate" their logs either
on a fixed schedule (hourly, daily, weekly or monthly) or when the
logfile reaches a fixed size. While you might want to include the "live"
logfile in your analysis (especially if the rotation schedule is fairly
long), you shouldn't archive these "live" logs, just the ones that have
been rotated.

From your initial description, I imagine that the .gz files are the
"rotated" logs - they are "done" and don't overlap. You could consider
the .log file as "temporary": it is constantly being updated until the
end of the day, when it is gzipped and a new .log is created. In that
situation, you only want to archive the .gz files. That's a fairly
common scenario for managing web server logs - I'd be surprised if your
server is using .log and .gz files any differently.

>> The simplest solution is add the "historical" files to a zip file,
>> and just run Analog against this zip file.
>
> The files are small at present but I'm not sure I understand what you
> mean by this. Sorry - I'm no doubt being dim.

If you have daily log files, it's often handy to stick a month's worth
of logs into a zip file, so that you have one file of 20MB, instead of
30 files of 10-15MB each. If you want to archive your log files on a
network drive, Analog will run much faster reading a 20MB .zip file over
the network than 400MB of plain text logfiles. (Log files will often
achieve 20:1 compression.) Even if the size of the logs isn't an issue,
if the only reason you're keeping them is to run Analog against them,
then collecting them in zip files can be a tidy way to maintain archived
logs.
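
If you'd rather script the weekly housekeeping than do it by hand, here is
a rough Python sketch of the idea, not something Analog ships with. The
function name archive_rotated_logs and its two arguments are made up for
illustration. It assumes your server gives each rotated .gz file a unique,
stable name (a date-stamped one, say); if the names are recycled (.1.gz,
.2.gz and so on) this approach would wrongly skip new data. It only looks
at the .gz files, ignores the live .log, and never adds a file that is
already in the zip, so downloading the same three weeks twice can't
double-count anything.

```python
import gzip
import zipfile
from pathlib import Path


def archive_rotated_logs(download_dir, archive_path):
    """Add each rotated .gz log in download_dir to a zip archive, once.

    Hypothetical helper: live .log files are skipped, and a log whose
    name is already in the archive is not added again, so repeated
    weekly downloads cannot introduce overlapping entries.
    Assumes rotated files have unique, stable names.
    """
    added = []
    # Mode "a" appends to an existing archive or creates a new one.
    with zipfile.ZipFile(archive_path, "a",
                         compression=zipfile.ZIP_DEFLATED) as archive:
        already_stored = set(archive.namelist())
        for gz_path in sorted(Path(download_dir).glob("*.gz")):
            # Drop the ".gz" suffix: "access-20070501.gz" -> "access-20070501"
            member_name = gz_path.stem
            if member_name in already_stored:
                continue
            # Store the uncompressed text; the zip layer recompresses it.
            with gzip.open(gz_path, "rb") as source:
                archive.writestr(member_name, source.read())
            added.append(member_name)
    return added
```

You would call it once per download, for example
archive_rotated_logs("downloads", "2007-05.zip") (both names are
placeholders), and then run Analog against the resulting zip file.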

Aengus

