> exclude those hosts (both with IP addresses and names) which query the
> server for a "/robots.txt" file.

   Analog doesn't have the capability to "remember" anything about
previous logfile lines. (If it did remember, it would be slower.)

   That would not stop you from running
`grep robots.txt access_log | cut -f1 -d' ' | sort -u > excludeIP`
and then (maybe even in the same pipeline) producing a special
exclude.cnf-type file containing a HOSTEXCLUDE <x> line for each host.
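The pipeline above can be extended with a sed stage to emit the HOSTEXCLUDE lines directly. This is only a sketch: it assumes a Common Log Format access_log with the client host in field 1, and the sample log below is made up for illustration.

```shell
#!/bin/sh
# Fabricated sample log, Common Log Format, host in field 1 (illustration only)
cat > access_log <<'EOF'
crawler.example.com - - [01/Jan/2000:00:00:00] "GET /robots.txt HTTP/1.0" 200 68
10.0.0.5 - - [01/Jan/2000:00:00:01] "GET /index.html HTTP/1.0" 200 1024
crawler.example.com - - [01/Jan/2000:00:00:02] "GET /page.html HTTP/1.0" 200 512
10.0.0.9 - - [01/Jan/2000:00:00:03] "GET /robots.txt HTTP/1.0" 200 68
EOF

# Same pipeline as above, with sed prefixing each unique host
# so the output is a ready-made analog exclude file
grep robots.txt access_log | cut -f1 -d' ' | sort -u \
  | sed 's/^/HOSTEXCLUDE /' > exclude.cnf

cat exclude.cnf
```

You could then pull exclude.cnf into your main configuration rather than retyping the hosts by hand.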

   Most "real" robots set an identifiable browser string anyway though, so
you could easily exclue them based on that- assuming you log that data of
course. I think most people will agree this is the best way to get rid of
robots in your reports.
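   As a sketch, a browser-string exclusion might look like the following
in your configuration file. BROWEXCLUDE is analog's directive for this;
the specific patterns are illustrative assumptions, not a definitive
robot list, and they only work if your logfile format records the
browser field.

    # Illustrative patterns; adjust to the robots you actually see
    BROWEXCLUDE *Googlebot*
    BROWEXCLUDE *Slurp*
    BROWEXCLUDE *spider*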

   Personally, when I'm bored I will often request robots.txt just to see
what people are putting in. Sometimes interesting things can be learned.

-=Jim=- 

------------------------------------------------------------------------
This is the analog-help mailing list. To unsubscribe from this
mailing list, send mail to [EMAIL PROTECTED]
with "unsubscribe" in the main BODY OF THE MESSAGE.
List archived at http://www.mail-archive.com/[email protected]/
------------------------------------------------------------------------