In message <[EMAIL PROTECTED]>, [EMAIL PROTECTED] writes:
>Synopsis: Web robots should be told not to index auto-generated index pages
>
>State-Changed-From-To: open-closed
>State-Changed-By: brian
>State-Changed-When: Wed Aug 27 10:47:33 PDT 1997
>State-Changed-Why:
>We talked about it on the developers list, and don't necessarily
>agree that index pages shouldn't be indexed by robots.  If
>you want to add custom META tags to your pages, you can set
>"IndexOptions SuppressHTMLPreamble", and then put a full HTML <HEAD>
>section in HEADER.html in each directory.
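
For anyone following along, the workaround described above amounts to
something like this (an untested sketch; the directory path, title and
heading are just placeholders):

    # httpd.conf (or .htaccess, where AllowOverride Indexes is permitted)
    <Directory /home/*/public_html/files>
        IndexOptions SuppressHTMLPreamble
        HeaderName HEADER.html
    </Directory>

    <!-- HEADER.html, copied into each auto-indexed directory -->
    <HTML>
    <HEAD>
    <TITLE>My files</TITLE>
    <META NAME="robots" CONTENT="noindex,follow">
    </HEAD>
    <BODY>
    <H1>My files</H1>
    <!-- mod_autoindex appends the listing and the closing tags itself -->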

However, this relies on the majority of web page authors being savvy enough
to know about the protocol, to get their admin to add the IndexOptions line,
and to remember to copy a HEADER.html into every directory.  I think that is
optimistic at best.

Does anyone really disagree that marking auto-index pages as
"noindex,follow" *by default* is a good idea?  That is all my suggestion
amounts to, since the default could still be overridden as you describe.
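
Concretely, the proposed default just means mod_autoindex emitting one extra
line in the <HEAD> of the pages it generates, along the lines of:

    <META NAME="robots" CONTENT="noindex,follow">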

The real problem robots have with the current situation is that (assuming
the robot author even appreciates the problem) it is hard to come up with a
reliable way to determine if a page is an auto-generated index page.
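
For example, about the only clues available are incidental details of the
generated markup, which (from memory, and depending on server version and
configuration) looks roughly like:

    <HEAD><TITLE>Index of /some/dir</TITLE></HEAD>
    <BODY>
    <H1>Index of /some/dir</H1>
    <A HREF="../">Parent Directory</A>
    ...

None of that is distinctive: the title and heading can be reworded or
localised, and plenty of hand-written pages look much the same, so any such
heuristic will misfire in both directions.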

Olly
