Re: [analog-help] Stupidest Ques. Award Goes To...
On Tue, 4 Apr 2000, Kevin Hemenway wrote:

> Our current analyzer stores everything in a tab-delimited flatfile - it doesn't remember referrers, it doesn't remember much of anything besides hits and total bytes for hits. And that's just fine.

This is pretty much what analog's cache file does. It stores the total number of hits for each item, but doesn't cross-reference (say) files and referrers.

> Ok. So, how would I go about this? One statement worries me:
>
> "A couple of other minor points: the pattern of failed requests and redirected requests over time is not recorded in the cache file. So although the total number will still be correct, the number in the last 7 days can be under-reported subsequently. And times are only recorded to five-minute resolution."
>
> One of my methods of madness is to take a look at the weekly logfile, find out how many accesses happened that week and add that to a general total on the home page of my site (Disobey.com). This statement worries me in that I'm very vain when it comes to total hits (over 7 million now). Would this ruin that vanity?

[...]

> Uh. Why would Analog look at the historic cache files to determine the hits for the brand-new last seven days report?

The only issue is when the historical cache file overlaps the last seven days. Does that answer your question?

> Ignoring that question, where are the cache files created?

Anywhere you want.

> [...] directory called /usage - have analog send its html reports there, and keep the cache reports in /usage/cache? And then each week, Analog would read from /usage/cache plus the new weekly log file to generate a new report under /usage? Would this new report under /usage, because of the cache, have total hits and monthly stuff based on how far back the cache goes?

I guess you've got two main choices. The one I recommend in the docs is to create a cache file from each logfile: then when you want a report, analyse all the cache files.

The other would be to create a cumulative cache file each week based on the last cache file and the new week's logfile. Personally I think this second procedure is much more likely to get confused, and when it does, all the data is corrupted together. :)

> Again, thanks for explaining all this to an Analog newbie. I hope I'm not embarrassing the mailing list all that much ;)

Not at all. I wish all the questions here were this intelligent. :)

--
Stephen Turner    http://www.statslab.cam.ac.uk/~sret1/
Statistical Laboratory, 16 Mill Lane, Cambridge CB2 1SB, England
"8th March 2000. National No Smoking Day. Ash Wednesday." (On a calendar)

This is the analog-help mailing list. To unsubscribe from this mailing list, send mail to [EMAIL PROTECTED] with "unsubscribe" in the main BODY OF THE MESSAGE.
List archived at http://www.mail-archive.com/analog-help@lists.isite.net/
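For anyone following along, here is a rough sketch of the first approach from the message above (one cache file per weekly logfile, then a report run that reads all of them). It only generates configuration text; it does not run Analog. The command names CACHEOUTFILE, CACHEFILE and OUTPUT NONE are given as I understand them from docs/cache.html, so please check them against your version. The paths, file names and helper functions are made up for illustration.

```python
from pathlib import PurePosixPath


def weekly_cache_config(logfile: str, cache_dir: str) -> str:
    """Config text for a run that turns ONE weekly logfile into its own cache file.

    Assumption: CACHEOUTFILE writes a cache file and OUTPUT NONE suppresses
    the report, as described in Analog's docs/cache.html.
    """
    cache_name = PurePosixPath(logfile).stem + ".cache"
    cache_path = PurePosixPath(cache_dir) / cache_name
    lines = [
        f"LOGFILE {logfile}",
        f"CACHEOUTFILE {cache_path}",
        "OUTPUT NONE",
    ]
    return "\n".join(lines) + "\n"


def report_config(cache_files: list[str], outfile: str) -> str:
    """Config text for a report run that reads every per-week cache file.

    Assumption: CACHEFILE reads a previously written cache file.
    """
    lines = [f"CACHEFILE {path}" for path in cache_files]
    lines.append(f"OUTFILE {outfile}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical layout: reports in /usage, cache files in /usage/cache.
    print(weekly_cache_config("/logs/access-2000-14.log", "/usage/cache"))
    print(report_config(
        ["/usage/cache/access-2000-13.cache", "/usage/cache/access-2000-14.cache"],
        "/usage/index.html",
    ))
```

Because each week lives in its own cache file, a bad week can be rebuilt from its logfile without touching the others, which is the robustness argument made above.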
Re: [analog-help] Stupidest Ques. Award Goes To...
On Wed, 5 Apr 2000, Kevin Hemenway wrote:

> Ok. So, this should never come into effect if I just set up a weekly log report that adds a new week to all the old cache files and reports? I have no plans on changing the reporting frequency. How would this come into effect though? When would the historical cache file (HCF) suck into that seven days? I can see that happening if the machine's date is set wrong, and the log file entries are subsequently dated wrong. I can see that happening if someone wanted to do the last ten days (from/to), but that's explained in the docs. Is there any time where an innocent nonesuch can cause the HCF to overlap? I'm just overly paranoid because the statement seemed so strong in the docs.

I really don't think it's a big issue. It's only the number of failed and redirected requests in the last 7 days which goes wrong, not the successful requests.

> Definitely makes sense. Is there any significant speed decrease when opening up VIRTDOMAINS x WEEKS x YEARS cache files per week as opposed to VIRTDOMAINS x YEARS or VIRTDOMAINS?

Probably not much.

> I can see Analog doing 1994 - 2000 on your machine, which is nice - how long does that take?

18 minutes on a 266 chip, but I don't use cache files. It would be MUCH quicker if I did.

> Do you have any system load readouts?

At a guess, something like this:

1 | ##
  | ##
  | ##
  | ##
  | ##
0 ###
  0:02am 0:20am

:)

--
Stephen Turner    http://www.statslab.cam.ac.uk/~sret1/
Statistical Laboratory, 16 Mill Lane, Cambridge CB2 1SB, England
"8th March 2000. National No Smoking Day. Ash Wednesday." (On a calendar)
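To put numbers on the VIRTDOMAINS x WEEKS x YEARS question, here is a small back-of-the-envelope sketch. The figures of 120 virtual domains and 3 years come from elsewhere in this thread; 52 weeks per year and the function name are my own assumptions. It only counts files and says nothing about how long Analog takes to read them.

```python
WEEKS_PER_YEAR = 52  # assumption: one cache file per week, 52 weeks per year


def cache_files_per_run(domains: int, years: int, files_per_year: int) -> int:
    """Number of cache files read in one full reporting run across all domains."""
    return domains * years * files_per_year


if __name__ == "__main__":
    domains, years = 120, 3
    weekly = cache_files_per_run(domains, years, WEEKS_PER_YEAR)  # VIRTDOMAINS x WEEKS x YEARS
    yearly = cache_files_per_run(domains, years, 1)               # VIRTDOMAINS x YEARS
    cumulative = domains                                          # one cumulative file per domain
    print(weekly, yearly, cumulative)
```

Per domain that is 156 weekly files after three years, against 3 yearly files or a single cumulative one.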
Re: [analog-help] Stupidest Ques. Award Goes To...
On Tue, 4 Apr 2000, Jeremy Wadsack wrote:

>> c) adding a new subdirectory to specifically watch.
>
> Again, does not affect the cache files.

As long as it wasn't done with a FILEINCLUDE/FILEEXCLUDE when creating the cache files.

--
Stephen Turner    http://www.statslab.cam.ac.uk/~sret1/
Statistical Laboratory, 16 Mill Lane, Cambridge CB2 1SB, England
"8th March 2000. National No Smoking Day. Ash Wednesday." (On a calendar)
Re: [analog-help] Stupidest Ques. Award Goes To...
> [...]
>
>> Uh. Why would Analog look at the historic cache files to determine the hits for the brand-new last seven days report?
>
> The only issue is when the historical cache file overlaps the last seven days. Does that answer your question?

Ok. So, this should never come into effect if I just set up a weekly log report that adds a new week to all the old cache files and reports? I have no plans on changing the reporting frequency. How would this come into effect though? When would the historical cache file (HCF) suck into that seven days? I can see that happening if the machine's date is set wrong, and the log file entries are subsequently dated wrong. I can see that happening if someone wanted to do the last ten days (from/to), but that's explained in the docs. Is there any time where an innocent nonesuch can cause the HCF to overlap? I'm just overly paranoid because the statement seemed so strong in the docs.

> I guess you've got two main choices. The one I recommend in the docs is to create a cache file from each logfile: then when you want a report, analyse all the cache files. The other would be to create a cumulative cache file each week based on the last cache file and the new week's logfile. Personally I think this second procedure is much more likely to get confused, and when it does, all the data is corrupted together. :)

Definitely makes sense. Is there any significant speed decrease when opening up VIRTDOMAINS x WEEKS x YEARS cache files per week as opposed to VIRTDOMAINS x YEARS or VIRTDOMAINS? I can see Analog doing 1994 - 2000 on your machine, which is nice - how long does that take? Do you have any system load readouts? (Perhaps a MRTG chart showing load and duration?)

I'm being especially paranoid, as you can see. The primary, and quickly explained reason is:

a) old, free log program had y2k issue,
b) personally i like analog's report/cust. better,
c) boss is annoyed at y2k issue with old, free log program
d) boss doesn't want to pay for new, old free log program g...
e) now is perfect time to strike with analog...
f) ... but everything has to be perfect ;)

>> Again, thanks for explaining all this to an Analog newbie. I hope I'm not embarrassing the mailing list all that much ;)
>
> Not at all. I wish all the questions here were this intelligent. :)

Whoo hoo! ;)

Kevin Hemenway
--
Total Net NH, LLC            EMAIL: [EMAIL PROTECTED]
15 Pleasant St., Suite 11    WEBSITE: http://www.totalnetnh.net/
Concord, NH 03301            PHONE: (603) 225-8422
Re: [analog-help] Stupidest Ques. Award Goes To...
> I don't see why. Log files compress *very* nicely (typically 95%-98% compression ratios at maximum compression), and Analog has no problem with compressed log files. Disk space is cheap nowadays, so why would keeping six years of log files be crazy? I have Analog running on three years' worth, and I'm sure there's people on this list who've got that beat.

Quite true, quite true, they do, but that's simply not an option I'm looking at in this case. Yes, we have more than enough space in there to handle that. What I don't want to do is have 120 virtual domains, with 3 years of log files each, being reanalyzed every week. I don't care how good Analog is - that's just not a server load, or a process, I would like to see happening. Nothing against Analog, of course.

So, scratch the idea of keeping the log files. Do I have any other option?

Kevin Hemenway
--
Total Net NH, LLC            EMAIL: [EMAIL PROTECTED]
15 Pleasant St., Suite 11    WEBSITE: http://www.totalnetnh.net/
Concord, NH 03301            PHONE: (603) 225-8422
Re: [analog-help] Stupidest Ques. Award Goes To...
> Kevin Hemenway wrote:
>
>> So, scratch the idea of keeping the log files. Do I have any other option?
>
> Cache files. See http://www.analog.cx/docs/cache.html. As long as you won't later be changing the data you want to report on from the past, this will work wonderfully. They reduce the amount of storage and memory usage needed by Analog and can be compressed on disk as well.

And if I did change the format? The data itself wouldn't change - it'd be straight log files from Apache 1.3.9. However, stuff that may change in the future:

a) addition of new logs (i.e., a referrer log, or error log report).
b) adding or removing a report from view
c) adding a new subdirectory to specifically watch.

I have no intention of messing with inclusions or exclusions. What would happen if I changed something that was different with the cache files, and Analog still ran? Corrupted data? Ignored cache files? New data only for the new cache files, and old data still displayed from the old cache files? Nothing?

Kevin Hemenway
--
Total Net NH, LLC            EMAIL: [EMAIL PROTECTED]
15 Pleasant St., Suite 11    WEBSITE: http://www.totalnetnh.net/
Concord, NH 03301            PHONE: (603) 225-8422
Re: [analog-help] Stupidest Ques. Award Goes To...
Kevin Hemenway wrote:

> So, scratch the idea of keeping the log files. Do I have any other option?

Cache files. See http://www.analog.cx/docs/cache.html. As long as you won't later be changing the data you want to report on from the past, this will work wonderfully. They reduce the amount of storage and memory usage needed by Analog and can be compressed on disk as well.

HTH,

Jeremy Wadsack
Wadsack-Allen Digital Group
Re: [analog-help] Stupidest Ques. Award Goes To...
Kevin Hemenway wrote:

>> Kevin Hemenway wrote:
>>
>>> So, scratch the idea of keeping the log files. Do I have any other option?
>>
>> Cache files. See http://www.analog.cx/docs/cache.html. As long as you won't later be changing the data you want to report on from the past, this will work wonderfully. They reduce the amount of storage and memory usage needed by Analog and can be compressed on disk as well.
>
> And if I did change the format? The data itself wouldn't change - it'd be straight log files from Apache 1.3.9. However, stuff that may change in the future:
>
> a) addition of new logs (i.e., a referrer log, or error log report).

Then referrer reports will only go back as far as the data does. Analog doesn't process error reports (they're meant for human consumption).

> b) adding or removing a report from view

Does not affect the cache files.

> c) adding a new subdirectory to specifically watch.

Again, does not affect the cache files.

> I have no intention of messing with inclusions or exclusions. What would happen if I changed something that was different with the cache files, and Analog still ran? Corrupted data? Ignored cache files? New data only for the new cache files, and old data still displayed from the old cache files? Nothing?

The only things that affect cache files are inclusions and exclusions (including those implied by time commands [FROM and TO] and *LOWMEM commands and those created by changes in logformat). If the data in a cache file is changed, Analog would still run, but some values may be erroneous (e.g. total count for previously excluded file, host, browser, etc) and some reports may not have data as far back as others (e.g. adding referrer data).

HTH,

Jeremy Wadsack
Wadsack-Allen Digital Group
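A toy illustration of why exclusions matter at cache-creation time. This is not Analog's cache format or matching logic, just a model in which a "cache" is a table of per-file hit totals; the request lists, the pattern and the function name are invented for the example.

```python
from collections import Counter
from fnmatch import fnmatch


def build_cache(requests, exclude_patterns=()):
    """Summarise requested paths into per-file hit totals.

    Paths matching an exclude pattern are dropped BEFORE counting, so the
    resulting summary keeps no trace of them.
    """
    return Counter(
        path
        for path in requests
        if not any(fnmatch(path, pattern) for pattern in exclude_patterns)
    )


if __name__ == "__main__":
    week1 = ["/index.html", "/img/logo.gif", "/img/logo.gif", "/about.html"]
    week2 = ["/index.html", "/img/logo.gif"]

    # Week 1 was summarised with images excluded; week 2 without exclusions.
    cache1 = build_cache(week1, exclude_patterns=["/img/*"])
    cache2 = build_cache(week2)

    combined = cache1 + cache2
    # The raw logs hold 3 requests for the logo, but only week 2's is recoverable.
    print(combined["/img/logo.gif"])
    print(combined["/index.html"])
```

Once the summary has been written, dropping the exclusion later cannot bring the week 1 image hits back; only the original logfile could.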
Re: [analog-help] Stupidest Ques. Award Goes To...
> a) They can't keep the log files right? That'd be crazy?

I don't see why. Log files compress *very* nicely (typically 95%-98% compression ratios at maximum compression), and Analog has no problem with compressed log files. Disk space is cheap nowadays, so why would keeping six years of log files be crazy? I have Analog running on three years' worth, and I'm sure there's people on this list who've got that beat.
Re: [analog-help] Stupidest Ques. Award Goes To...
Kevin Hemenway wrote:

> My problem is thus: I see sites (like Analog's) that have usage stats from 1994 to the year 2000. And my question is: how is that possible? Questions running through my head:
>
> a) They can't keep the log files right? That'd be crazy?

Depends on how busy your site is. If you get 10MB of logs per day, you can easily store 5 years of uncompressed log files on a $200 hard drive. With compression, you could probably increase that by a factor of 5-10. (How much memory you'd need to analyse this, though, I have no idea - I've run a couple of reports on about 3GB of log files on a system with 250MB of RAM).

> b) Is it done with cache files? Cache files from 1994 to the year 2000?

It could be, though you lose some detail with cache files, of course. In many cases, this loss of detail doesn't really matter, especially for data that's 3 years old.

> c) How does Analog, in March of 2000, know about the reports from 1994, to generate a new monthly report with info from 1994 to the year 2000?

?? How does Analog know about anything? You tell it where to find the information. You can tell Analog to include as many logfiles in a single run as you want. You can say

LOGFILE 1998*.LOG 1999*.LOG 2000*.LOG

if you want.

> d) Does it have anything to do with OUTPUT COMPUTER?

No. Read docs/cache.html

> And if it does, does that mean that Analog would have to be run twice for each config file? One to generate the COMPUTER file (this COMPUTER file - how would it remember all the other reports from 1994 onward? or does it?), and the one to analyze the COMPUTER file and report everything?

Analog will create an output file and a cache file at the same time, if you tell it to. But HTML and COMPUTER are two different OUTPUT types, so you can only get one of them per run. CACHE isn't an OUTPUT type.

> As you can see, I have no clue. And I'm averse to installing it until I can see it working on paper.

You'd have had it installed, and run some sample reports, in the time it took you to write this note.

> Basically, this is what I want:
>
> a) Reports generated every week.
> b) A monthly report.
> c) An overall report.

Nothin' to it...

> We've got about 120 virtual domains that would all follow one format (with separate reports for each virt.) and then one guy (me) who's annoying and likes to tweak everything to death.

The number of different ways you can tweak Analog reports is absolutely stunning. Should keep you satisfied for quite some time!
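The disk-space estimate in the reply above works out as follows. This is plain arithmetic on the figures quoted there (10MB of logs per day, 5 years, a compression factor of 5-10); the 365-day year, the decimal MB-to-GB conversion and the function name are my own assumptions.

```python
DAYS_PER_YEAR = 365  # assumption: ignore leap days


def log_storage_mb(mb_per_day: float, years: float, compression_factor: float = 1.0) -> float:
    """Approximate disk space in MB needed to keep `years` of logs."""
    return mb_per_day * DAYS_PER_YEAR * years / compression_factor


if __name__ == "__main__":
    raw = log_storage_mb(10, 5)                                # uncompressed
    modest = log_storage_mb(10, 5, compression_factor=5)       # low end of the quoted range
    aggressive = log_storage_mb(10, 5, compression_factor=10)  # high end of the quoted range
    for label, size in [("raw", raw), ("5x", modest), ("10x", aggressive)]:
        print(f"{label}: {size / 1000:.2f} GB")
```

So five years of logs at that rate is roughly 18 GB uncompressed, or somewhere between about 1.8 GB and 3.7 GB with the quoted compression factors.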