Hi, when people browse Google's cached version of a web page of mine or use the 'what's related' feature, they leave lines like these in my log file:

0.129.141.165 - - [15/Aug/2003:01:26:45 +0200] "GET / HTTP/1.1" 200 8630 "http://www.google.de/search?hl=de&lr=lang_de&ie=UTF-8&oe=UTF-8&q=related:www.physik.tu-berlin.de/~tallera/gb/gaestebuch.phtml" "Mozilla/4.0"
0.27.136.10 - - [19/Aug/2003:14:52:46 +0200] "GET / HTTP/1.1" 304 - "http://www.google.com.tr/search?q=cache:_DSfUGirLE8J:atom.physik.tu-berlin.de/pub2002.0.html+Photoionisation+studies+of+the+2p+resonances+of+atomic+Calcium&hl=tr&ie=UTF-8" "Mozilla/4.0"

These lines end up in the Search Query Report and the Search Word Report. I'm using a fairly standard config file with analog 5.23 (also tried 5.32), and I'm not quite happy with the outcome for the following reasons:

- the "related:" entry should not appear in the Search Word Report
- the cache:_xxx[URL] part should be suppressed completely in both reports
- unlike real searches, where one sees a single Referer line per client (for the main document), a line containing "cache:_xxx[URL]" is logged for every image referenced by the main document (which itself is not fetched). As a result, 'cache' search queries and words appear exaggeratedly popular.

I have no good idea yet how to fix this. At least I doubt it can be easily configured in analog, or did I miss something?

In the meantime, while the first two issues aren't solved, maybe this one is easier: for the Search Word Report, the URL part of the 'cache:' or 'related:' expressions is split into two parts at the hyphen "-" in it, so I get "cache:_dsfugirle8j:atom.physik.tu" as a popular search word and "berlin.de/pub2002.0.html" as well. Why does analog split at the hyphen at all?

Bye, Tobias

--
Tobias Richter
LabPZ - AG Prof. Dr. P. Zimmermann
TU-Berlin PN 3-2
Hardenbergstr. 36
10623 Berlin, Germany
Phone: +49-30-314-23010
Fax: +49-30-314-23018
http://atom.physik.tu-berlin.de
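
One possible workaround for the first two points, independent of analog's own configuration, would be to clean the Referer field before the log reaches analog. The following is only a rough sketch under assumptions: it expects log lines in the combined format shown above, treats any query word starting with "cache:" or "related:" as an operator token to drop, and uses hypothetical helper names (clean_query, clean_referrer, rewrite_line) that are not part of analog. It does not address the third point (one logged line per image).

```python
import re
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: operator tokens look like "cache:ID:host/path" or "related:host/path".
OPERATOR = re.compile(r"^(cache|related):", re.IGNORECASE)

# Assumption: combined log format; the referrer is the second quoted field.
LINE = re.compile(r'^(\S+ \S+ \S+ \[[^\]]*\] "[^"]*" \S+ \S+ )"([^"]*)"(.*)$')


def clean_query(query):
    """Return the search words with cache:/related: tokens removed."""
    return " ".join(w for w in query.split() if not OPERATOR.match(w))


def clean_referrer(url):
    """Rewrite the q= parameter of a referrer; drop it if nothing remains."""
    parts = urlsplit(url)
    params = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == "q":
            value = clean_query(value)
            if not value:
                continue  # e.g. a pure "related:" lookup has no real search words
        params.append((key, value))
    # Note: urlencode re-encodes all parameters, so escaping may differ slightly
    # from the original referrer string.
    return urlunsplit(parts._replace(query=urlencode(params)))


def rewrite_line(line):
    """Return the log line with a cleaned referrer; unparsable lines pass through."""
    m = LINE.match(line)
    if not m:
        return line
    return f'{m.group(1)}"{clean_referrer(m.group(2))}"{m.group(3)}'
```

Each log line (without its trailing newline) would be passed through rewrite_line, and the result written to a new file for analog to read. With the two example lines above, the "related:" referrer loses its q= parameter entirely, and the "cache:" referrer keeps only the real search words.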
