You can use the REFEXCLUDE line to remove all cache entries from your reports if you don't want them at all. Something like this should work:
REFEXCLUDE http://*google*/*q=cache*

If you want to keep the requests (because they are valid, even if there
appear to be a lot of them) but don't want them to show up in the search
word report, you could either change them with a REFALIAS command so they
don't match the SEARCHENGINE command, or change the Google SEARCHENGINE
command to be more particular. (I think that with a REGEXP, if it's really
Perl-compatible, you can use a zero-width negative look-ahead assertion to
make sure that the query isn't a cache: item.)

--
Jeremy Wadsack
Wadsack-Allen Digital Group

Tobias Stefan Richter ([EMAIL PROTECTED]; Friday, August 22, 2003 8:13 AM):
> Hi,
>
> when people browse Google's cached version of a web page of
> mine or use the 'what's related' feature, they leave lines
> like these in my log file:
>
> 0.129.141.165 - - [15/Aug/2003:01:26:45 +0200] "GET / HTTP/1.1" 200 8630
> "http://www.google.de/search?hl=de&lr=lang_de&ie=UTF-8&oe=UTF-8&q=related:www.physik.tu-berlin.de/~tallera/gb/gaestebuch.phtml"
> "Mozilla/4.0"
>
> 0.27.136.10 - - [19/Aug/2003:14:52:46 +0200] "GET / HTTP/1.1" 304 -
> "http://www.google.com.tr/search?q=cache:_DSfUGirLE8J:atom.physik.tu-berlin.de/pub2002.0.html+Photoionisation+studies+of+the+2p+resonances+of+atomic+Calcium&hl=tr&ie=UTF-8"
> "Mozilla/4.0"
>
> They enter the Search Query Report and the Search Word Report.
> Using a quite standard config file with analog 5.23 (also
> tried 5.32), I'm not quite happy with the outcome for the
> following reasons:
>
> - the related entry should not appear in the Search Word Report
> - the cache:_xxx[URL] part should be suppressed completely in
>   both reports
> - unlike real searches, where one sees a single Referer line
>   per client (for the main document), a line containing
>   "cache:_xxx[URL]" is seen for every image referenced by the
>   main document (which itself is not fetched).
>
> Because of that, 'cache' search queries and words appear exaggeratedly
> popular. I have no good idea yet how to fix this.
> At least I doubt it can be easily configured in analog, or did I miss
> something?
>
> In the meantime, while the first two issues aren't solved, maybe
> this one is easier: for the Search Word Report, the URL part
> of the 'cache:' or 'related:' expressions is split into two parts
> due to the hyphen "-" in it, so I get "cache:_dsfugirle8j:atom.physik.tu"
> as a popular search word and "berlin.de/pub2002.0.html" as well.
> Why does analog split at the hyphen at all?
>
> Bye,
> Tobias
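
The negative look-ahead idea from the reply can be tried outside analog before touching the config. The following is only a rough sketch in Python, not analog configuration syntax: the pattern, the name REAL_GOOGLE_SEARCH and the helper is_real_search are made up for illustration, and the third referrer is an invented example of an ordinary search. Python's re module is not PCRE, but it supports the same (?!...) construct. Whether analog's SEARCHENGINE command accepts such a pattern is not confirmed here.

```python
import re

# Hypothetical pattern: a Google search referrer whose q= value does NOT
# start with "cache:" or "related:". The (?!...) group is the zero-width
# negative look-ahead; it checks the following text without consuming it.
REAL_GOOGLE_SEARCH = re.compile(
    r"^http://[^/]*google\.[^/]+/search\?(?:.*&)?q=(?!cache:|related:)"
)


def is_real_search(referrer: str) -> bool:
    """Return True if the referrer looks like an ordinary Google search."""
    return REAL_GOOGLE_SEARCH.match(referrer) is not None


referrers = [
    # The two referrers quoted in the original message.
    "http://www.google.de/search?hl=de&lr=lang_de&ie=UTF-8&oe=UTF-8"
    "&q=related:www.physik.tu-berlin.de/~tallera/gb/gaestebuch.phtml",
    "http://www.google.com.tr/search?q=cache:_DSfUGirLE8J:"
    "atom.physik.tu-berlin.de/pub2002.0.html+Photoionisation+studies"
    "+of+the+2p+resonances+of+atomic+Calcium&hl=tr&ie=UTF-8",
    # Invented example of a normal search, for comparison only.
    "http://www.google.com/search?hl=en&q=photoionisation+calcium",
]

results = [is_real_search(r) for r in referrers]

for referrer, ok in zip(referrers, results):
    print("search" if ok else "skip  ", referrer[:60])
```

With this pattern only the third referrer counts as a search; the related: and cache: referrers are skipped because the look-ahead fails right after q=.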
