In message <[EMAIL PROTECTED]>, [EMAIL PROTECTED] writes:
>Synopsis: Web robots should be told not to index auto-generated index pages
>
>State-Changed-From-To: open-closed
>State-Changed-By: brian
>State-Changed-When: Wed Aug 27 10:47:33 PDT 1997
>State-Changed-Why:
>We talked about it on the developers list, and don't necessarily
>agree that index pages shouldn't be indexed by robots. If
>you want to add custom META tags to your pages, you can set
>"IndexOptions SuppressHTMLPreamble", and then put a full HTML <HEAD>
>section in HEADER.html in each directory.
>
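For reference, the workaround described above might look something like the following. The directive and filename come from the quoted message; the META tag is the standard robots exclusion syntax. (The directory path is only an illustration.)

```
# In the server or directory configuration:
IndexOptions SuppressHTMLPreamble

# Then, in HEADER.html within the directory being indexed,
# supply a full <HEAD> section of your own:
<HTML>
<HEAD>
<TITLE>Index of /pub</TITLE>
<META NAME="robots" CONTENT="noindex,follow">
</HEAD>
```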
However, this relies on a majority of web page authors being savvy enough to know about the protocol, to get their admin to add the IndexOptions line, and to remember to copy HEADER.html into every directory. I think that is optimistic at best.

Does anyone really disagree that marking auto-generated index pages as "noindex,follow" *by default* is a good idea? That is all my suggestion amounts to, since the default could be overridden as you describe.

The real problem robots have with the current situation is that (assuming the robot author even appreciates the problem) it is hard to come up with a reliable way to determine whether a page is an auto-generated index page.

Olly
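To illustrate why detection from the robot's side is unreliable: about the best a robot author can do is look for incidental markers of mod_autoindex output. The function and sample below are hypothetical, a minimal sketch assuming the "Index of /" title and "Parent Directory" link that Apache happens to emit; since an admin can override all of these (HEADER.html being one way), the heuristic breaks exactly when the page stops looking like a stock index.

```python
import re

def looks_like_autoindex(html: str) -> bool:
    """Rough heuristic for spotting an Apache auto-generated index page.

    Checks for two incidental markers of stock mod_autoindex output:
    a title beginning "Index of /" and a "Parent Directory" link.
    Both can be customized away, which is why no robot can rely on this.
    """
    has_index_title = re.search(r"<title>\s*Index of /", html,
                                re.IGNORECASE) is not None
    has_parent_link = "Parent Directory" in html
    return has_index_title and has_parent_link

# A fragment resembling stock mod_autoindex output (illustrative only):
sample = ('<html><head><title>Index of /pub</title></head>'
          '<body><a href="../">Parent Directory</a></body></html>')
```

A default "noindex,follow" META tag would make this guesswork unnecessary.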
