I've found it! It's a bug/conversion error in the Robots file I got from
Jeremy's site.

It appears that Analog treats a comma as a delimiter, so any line with a
comma in its argument is read as two separate ROBOTINCLUDE entries. Hence
the following lines:

ROBOTINCLUDE "HKU WWW Robot,*"
ROBOTINCLUDE "Hazel's Ferret Web hopper,*"
ROBOTINCLUDE IBM_Planetwide,*
ROBOTINCLUDE JoeBot/x.x,*
ROBOTINCLUDE "Openfind data gatherer, Openbot/*"

all get interpreted by Analog (on Windows XP) as separate robot include
entries, thus (from the SETTINGS ON dump):

  + HKU WWW Robot
  + *
  + Hazel's Ferret Web hopper
  + *
  + IBM_Planetwide
  + *
  + JoeBot/x.x
  + *
  + Openfind data gatherer
  +  Openbot/*

The bare wildcard (*) entries therefore match EVERY User-Agent found, so
every visitor gets classed as a robot.
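To make the effect concrete, here is a minimal Python sketch of the behaviour described above. It is only a model of what the SETTINGS ON dump suggests, not Analog's actual parser: the function names split_robotinclude and is_robot are made up for illustration, Python's fnmatchcase stands in for Analog's wildcard matching, and the browser User-Agent string is just an example.

```python
from fnmatch import fnmatchcase


def split_robotinclude(line):
    """Model the observed behaviour: take the argument of a ROBOTINCLUDE
    line, strip surrounding double quotes, then split it on commas.
    (Assumption: this mimics the dump output, not Analog's real code.)"""
    command, _, arg = line.partition(" ")
    if command != "ROBOTINCLUDE":
        raise ValueError("not a ROBOTINCLUDE line")
    arg = arg.strip()
    if len(arg) >= 2 and arg[0] == '"' and arg[-1] == '"':
        arg = arg[1:-1]
    return arg.split(",")


def is_robot(user_agent, patterns):
    """True if any pattern matches the whole User-Agent string."""
    return any(fnmatchcase(user_agent, p) for p in patterns)


patterns = split_robotinclude('ROBOTINCLUDE "HKU WWW Robot,*"')
print(patterns)  # ['HKU WWW Robot', '*']

# An ordinary browser is now "a robot", because the stray '*' matches anything.
browser = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
print(is_robot(browser, patterns))  # True
```

The Openfind line comes out of the same model as 'Openfind data gatherer' and ' Openbot/*' (with the leading space), which is what the dump shows.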

By way of correction, I have simply dropped the comma (because the wildcard
* immediately following will pick it up anyway, I'm assuming) on all the
errant lines bar the last one.
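A quick way to sanity-check that assumption, again using Python's fnmatchcase as a stand-in for Analog's wildcard matching (which may differ in details such as case sensitivity). Both User-Agent strings below are invented for the test:

```python
from fnmatch import fnmatchcase

# Pattern after dropping the comma from "HKU WWW Robot,*"
fixed_pattern = "HKU WWW Robot*"

# Hypothetical User-Agent strings, for illustration only.
agent_with_comma = "HKU WWW Robot, v1.0"
browser = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

# The trailing '*' absorbs the comma and whatever follows it.
print(fnmatchcase(agent_with_comma, fixed_pattern))  # True

# Ordinary browsers are no longer caught.
print(fnmatchcase(browser, fixed_pattern))  # False
```

So the comma-less pattern is slightly looser than the original intent (it no longer requires a comma after the name), but it keeps the robot match and stops matching everything else.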

However, I'm not quite sure what to do with the Openfind one - is there a
way of escaping the comma, and would that be more advisable? I'd like to
match the string as closely as possible to what the User-Agent reports
(because I am a geek - lol)

And remember I said I'd actually tested this by removing the CONFIGFILE
line that loaded up my Robots file, and the OS Report still failed? Well,
by rights, you'd think this would've been fixed at that point - except, it
turns out, for some strange reason, my SETTINGS ON dump log shows TWO
inclusions of the Robot file's data! All the items prefixed with + above,
are actually listed twice in the dump log. Is this because of caching, or
what? I haven't altered any caches manually - I simply edited my Robots
file to fix these commas, and on the very next run everything worked.

I don't quite understand what's going on there (i.e., why does it do two
reads if it's not caching, and why did it list them twice? The robots file
was very definitely only being loaded once!).

But I'm sure glad to have fixed it, in any event!

Thanks for the suggestions, Jeremy - they did the trick ;-)

Regards
Neil

