[analog-help] Very large logfiles
I apologize if this shows up on the list twice. I posted to the newsgroup yesterday, but it never appeared.

First of all, I'd like to say that Analog is the fastest log analysis software I have ever seen. On a fast Intel box it chews through our ~12M lines of Apache logs in about 2 minutes. Great software :)

Now for my problem. We process ~3GB of logs daily for our main domain. Management likes to see cumulative numbers from day to day. So I process the log files and produce a computer output file for Report Magic, as well as a cachefile. The next day, when my scripts run, they move the previous day's CACHEOUTFILE to the CACHEFILE filename and process the logfiles and the CACHEFILE together to create a cumulative report.

The box I'm doing this on has 4GB of RAM, but Analog still uses all of it and dies with a "Ran out of memory" error. I read the docs on cachefiles and low memory usage, but even with HOSTLOWMEM 3 and a FILEALIAS for our commonly accessed filenames, I'm still running out of memory after about 2 days' worth of data.

Am I doing something wrong? I see people who have over a year's worth of data in their reports, but at this rate I won't be able to get a week. Can anyone offer some advice?

Sam

+
| TO UNSUBSCRIBE from this list:
|   http://lists.isite.net/listgate/analog-help/unsubscribe.html
|
| Digest version: http://lists.isite.net/listgate/analog-help-digest/
| Usenet version: news://news.gmane.org/gmane.comp.web.analog.general
| List archives: http://www.analog.cx/docs/mailing.html#listarchives
+
Re: [analog-help] Very large logfiles
Hi-

This is just a stab in the dark, but do you have the referrer report turned on? That will use up a lot of memory.

--Quentin
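
A minimal sketch of how that suggestion might be tried in the Analog configuration file. The report switches and REFLOWMEM are standard Analog commands, but which reports are actually enabled on Sam's system is an assumption, and the thread does not confirm how much memory this would save:

```
# Turn off the reports that are built from the referrer field
# (assumes they are currently on)
REFERRER OFF
REFSITE OFF
SEARCHQUERY OFF
SEARCHWORD OFF

# Also ask Analog to keep as little as it can about individual
# referrers (same 0-3 scale as HOSTLOWMEM)
REFLOWMEM 3
```

If memory use drops noticeably with these lines in place, the referrer data is a likely culprit; if not, the cause is probably elsewhere.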
Re: [analog-help] Very large logfiles
Greetings, Sam.

One solution to dealing with large log files is to break down the report into multiple reports, using aggressive ALIAS and LOGFORMAT techniques.

On a custom system I set up, a suite of reports runs with the aid of a Perl helper program once per hour. On the raw traffic report, all page requests are aliased to the same filename, somepage.someformat, using FILEALIAS. A subsequent FILEINCLUDE command passes only that one filename and filters out everything else. For that report, almost all the fields in the log file are defined as junk, using %j -- the only ones kept are the requested file %r and the time/date fields %d %M %Y %h %n.

The resulting cachefiles are quite small, and so are the demands upon the box doing the number crunching. 16 reports update in under a minute. The first time I run a report, it takes a lot longer to crunch through all the logs, write all the cachefiles and complete the report, but that only has to happen once.

Each report contains far less than you get in a default analog config, but since the default config chokes when we feed it our giganto-logfiles, divide and conquer seems to be the best bet.

I recommend that you peep a cachefile and see what's taking up a lot of space. If it's data you can't live without, break that bit out into a separate report, with its own set of cachefiles.

Cheers,
-- Marvin Humphrey

On Apr 27, 2004, at 8:51 AM, Samuel Kesterson wrote:
> I apologize if this shows up on list twice. I posted to the newsgroup yesterday, but it never posted. First of all I'd like to say that Analog is the fastest log analysis software I have ever seen. On a fast intel it chews through our ~12M lines of apache logs in about 2 minutes. Great software :)
>
> Now for my problem. We process ~3GB of logs daily for our main domain. The management likes to see cumulative numbers from day to day. So, what I'm doing is processing the log files and making a computer output file for Report Magic, as well as a cachefile. Then the next day, when my scripts run, they move the CACHEOUTFILE from the day before to the CACHEFILE filename, and process the logfiles and the CACHEFILE together to create a cumulative report.
>
> The box I'm doing this on has 4GB of ram, but it's still using it all and blowing out with a Ran out of memory error. I read the docs on cachefiles and low memory usage, but even with HOSTLOWMEM 3 and a FILEALIAS for our commonly accessed filenames, I'm still running out of memory after about 2 days worth of data.
>
> Am I doing something wrong? I see people that have over a years worth of data in their reports, but at this rate, I'll not be able to get a week. Can anyone offer some advice?
>
> Sam
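
The raw traffic report described above might be configured roughly as follows. This is only a sketch under stated assumptions, not Marvin's actual config: the file paths, the /*.html pattern and the common-format log layout are all hypothetical, and the commands should be checked against the Analog docs before use.

```
# raw-traffic.cfg: one narrow report with its own cachefiles
# (all paths and the /*.html pattern are hypothetical)

# Keep only the requested file and the date/time; everything else is junk.
# Assumes Apache common-format lines; a combined-format log would need
# two more quoted %j fields at the end.
LOGFORMAT (%j %j %j [%d/%M/%Y:%h:%n:%j %j] "%j %r %j" %j %j)
LOGFILE /var/log/apache/access.log

# Collapse every page request onto a single name, then pass only that name
FILEALIAS /*.html /somepage.someformat
FILEINCLUDE /somepage.someformat

# Cachefiles used by this report only
CACHEFILE /var/cache/analog/raw-traffic.old
CACHEOUTFILE /var/cache/analog/raw-traffic.new
```

Each additional report would get its own config file along the same lines, keeping a different small set of fields and writing to its own pair of cachefiles.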
[analog-help] Analog not reading beyond certain point in log files.
Last week I turned on referrer and browser logging in my Apache httpd.conf. Analog reads the log files up until that point, but refuses to read anything after it. I have tried setting LOGFORMAT to COMBINED, but have had no luck getting it to work. Because of this, it will not produce the browser, platform or OS reports.

Has anyone else experienced this? I would be very grateful for any pointers.

Colin
Re: [analog-help] Analog not reading beyond certain point in log files.
This issue has been discussed on this list before, so I looked in the list archives and found that specifying multiple log formats is acceptable; Analog will use the first matching format. To do this, specify the log formats first, then the logfile, like this:

    APACHELOGFORMAT (%h %l %u %t %v \"%r\" %s %b)
    APACHELOGFORMAT (%h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-Agent}i\")
    LOGFILE /var/log/apache/access.log

The example above might not have the exact fields that you need. See http://www.analog.cx/docs/logfmt.html for a list of fields.

By the way, when using Apache, the APACHELOGFORMAT command often produces better results than LOGFORMAT.

I viewed the following archived messages:

    http://www.mail-archive.com/[EMAIL PROTECTED]/msg03282.html
    http://www.mail-archive.com/[EMAIL PROTECTED]/msg11625.html
    http://www.mail-archive.com/[EMAIL PROTECTED]/msg13476.html

HTH,
-- Duke

kn0wledge wrote:
> Last week I turned on referrer and browser logging in my Apache httpd.conf. Analog read the log files up until that point, but refuses to read anything after. I have tried setting the LOGFORMAT variable to COMBINED but have had no luck in getting it to work. Because of this, it will not output browser, platform or OS reporting.
>
> Has anyone else experienced this? I would be very grateful for any pointers.
>
> Colin
Re: [analog-help] PDF
See http://analog.cx/docs/faq.html#faq143.

HTH,
-- Duke

Luis Mercado wrote:
> How can I know how many people saw a PDF file, and how many downloaded it? I read that a single PDF can score many hits. Is there a way to control it?
>
> Thanks,
> Luis.