Hi,

When people browse Google's cached version of one of my web
pages or use the 'what's related' feature, they leave lines
like these in my log file:

0.129.141.165 - - [15/Aug/2003:01:26:45 +0200] "GET / HTTP/1.1" 200 8630 
"http://www.google.de/search?hl=de&lr=lang_de&ie=UTF-8&oe=UTF-8&q=related:www.physik.tu-berlin.de/~tallera/gb/gaestebuch.phtml"
 "Mozilla/4.0"

0.27.136.10 - - [19/Aug/2003:14:52:46 +0200] "GET / HTTP/1.1" 304 - 
"http://www.google.com.tr/search?q=cache:_DSfUGirLE8J:atom.physik.tu-berlin.de/pub2002.0.html+Photoionisation+studies+of+the+2p+resonances+of+atomic+Calcium&hl=tr&ie=UTF-8"
 "Mozilla/4.0"

They show up in the Search Query Report and the Search Word Report.
Using a fairly standard config file with analog 5.23 (I also
tried 5.32), I'm not quite happy with the outcome, for the
following reasons:

- the related entry should not appear in the Search Word Report

- the cache:_xxx[URL] part should be suppressed completely in 
  both reports

- unlike real searches, where one sees a single Referer line
  per client (for the main document), a line containing
  "cache:_xxx[URL]" is seen for every image referenced by the main
  document (which itself is not fetched). As a result, 'cache'
  search queries and words appear far more popular than they are.
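
One workaround outside analog might be to recognise such lines before the
log is analysed. The following is only a rough Python sketch, assuming the
combined log format shown above; the helper name and the regular expression
are my own invention and not part of analog:

```python
import re
from urllib.parse import urlsplit, parse_qs

# Hypothetical helper, not part of analog. It assumes the combined log
# format: "request" status bytes "referrer" "user-agent".
LOG_RE = re.compile(r'"(?:[^"\\]|\\.)*" \d{3} \S+ "(?P<referrer>[^"]*)"')

def is_google_operator_referrer(line):
    """Return True if the referrer is a Google cache: or related: query."""
    match = LOG_RE.search(line)
    if not match:
        return False
    parts = urlsplit(match.group("referrer"))
    if "google." not in parts.netloc:
        return False
    # parse_qs decodes "+" and %xx escapes in the q parameter.
    queries = parse_qs(parts.query).get("q", [])
    return any(q.lower().startswith(("cache:", "related:")) for q in queries)
```

A filter built on this could drop such lines before the log is passed to
analog, at the cost of losing those requests from all other reports too.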

I have no good idea yet how to fix this.
At least I doubt it can easily be configured in analog, or did I miss
something?

In the meantime, while the first two issues aren't solved, maybe
this one is easier to address: for the Search Word Report, the URL part
of the 'cache:' or 'related:' expression is split into two parts
at the hyphen "-" in it, so I get "cache:_dsfugirle8j:atom.physik.tu"
as a popular search word, and "berlin.de/pub2002.0.html" as well.
Why does analog split at the hyphen at all?
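
To illustrate the symptom, here is a small Python sketch. I don't know how
analog really breaks a query into words; the rule below (lower-case, then
split on whitespace and hyphens) is purely an assumption that happens to
reproduce the two words I see in the report:

```python
import re

def split_words(query):
    """Lower-case the query and split it on whitespace and hyphens.

    This is an assumed rule for illustration, not analog's actual code.
    """
    return [word for word in re.split(r"[\s-]+", query.lower()) if word]

# Decoded q parameter from the second log line above (shortened).
query = ("cache:_DSfUGirLE8J:atom.physik.tu-berlin.de/pub2002.0.html "
         "Photoionisation studies")
print(split_words(query))
```

With such a rule, any host name containing a hyphen ends up as two separate
"words", which is exactly what I would like to avoid.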

Bye,
Tobias

-- 
Tobias Richter       LabPZ - AG Prof. Dr. P. Zimmermann
TU-Berlin  PN 3-2               Phone: +49-30-314-23010
Hardenbergstr. 36               Fax:   +49-30-314-23018
10623 Berlin Germany    http://atom.physik.tu-berlin.de