Many thanks for this, Stephen. Yes, there do seem to be an unusually high number of corrupt log lines. The problem seems to be truncation. I don't yet know (haven't had time to check) whether they all terminate at a specific length; possibly so, in which case it may be a server config issue, with long URIs causing the log lines to overflow and be truncated, rendering them useless. So it's probably fair to guess that a lot of these lines are page reads with long associated URIs that have been truncated. Hence I'm losing page reads.
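One way to check the "specific length" hunch is to tally the lengths of the lines that fail to parse. The sketch below is only a rough illustration: it assumes Apache-style common/combined log lines, and the pattern `COMPLETE_LINE` and the function `corrupt_length_counts` are hypothetical names, not part of analog. A sharp spike at a single length would point towards a fixed buffer or line-length limit somewhere on the server side.

```python
import re
import sys
from collections import Counter

# Assumption: Apache-style common/combined log format. Adjust the pattern
# to whatever format the server actually writes.
COMPLETE_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (?:\d+|-)(?: "[^"]*" "[^"]*")?$'
)


def corrupt_length_counts(lines):
    """Count how often each line length occurs among lines that do not parse."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\r\n")
        # Skip blank lines; tally the length of anything the pattern rejects.
        if line and not COMPLETE_LINE.match(line):
            counts[len(line)] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python corrupt_lengths.py access.log
    with open(sys.argv[1], errors="replace") as fh:
        counts = corrupt_length_counts(fh)
    for length, n in counts.most_common(10):
        print(f"{length:6d} chars: {n} lines")
```

If most of the rejected lines share one length, truncation at a fixed limit is the likely cause; a broad spread of lengths would suggest something else is corrupting them.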
Re browser activity and caching: with '304ISSUCCESS ON' I presume a GET request with a 304 will be counted as a page read? (Of course, if the browser (or an intermediate proxy) doesn't return the request to the server...)

I've not yet got to the bottom of the 'mis-typed' URLs. I've grep-ed out some of the 'file type' patterns, but looking at the result I see these are matches on the referrer, not the target URL, so I need to do more to find the lines analog is seeing as a specific file type.

Thanks for your help...
/Iain

-----Original Message-----
From: analog-help-boun...@lists.meer.net [mailto:analog-help-boun...@lists.meer.net] On Behalf Of Stephen Turner
Sent: 21 February 2009 17:17
To: Support for analog web log analyzer
Subject: Re: [analog-help] Problem with page counts

OK, there are lots of things here, but the first important thing to say is that logfile analysis and page tagging will never match up. They use fundamentally different techniques, and each makes errors that the other is not susceptible to. For page views you would normally expect to see the logfile analysis numbers lower, because page tagging will see the page again if the visitor returns to it, but logfile analysis won't.

You do have too many corrupt lines. If you turn debugging on, you will see all the corrupt lines, and where in the line they were corrupt.

It looks like you have about 100,000 of these strange ".s=tl" lines, right? Page tagging may be including them as pages, depending what they really are and whether they are tagged, so it may be worth tracking them down in the logfiles.

Sorry, no great insights, but at least that might give you some avenues to look down.
--
Stephen Turner
+------------------------------------------------------------------------
| TO UNSUBSCRIBE from this list:
| http://lists.meer.net/mailman/listinfo/analog-help
|
| Analog Documentation: http://analog.cx/docs/Readme.html
| List archives: http://www.analog.cx/docs/mailing.html#listarchives
| Usenet version: news://news.gmane.org/gmane.comp.web.analog.general
+------------------------------------------------------------------------