The following reply was made to PR mod_dir/1057; it has been noted by GNATS.
From: Marc Slemko <[EMAIL PROTECTED]>
To: Olly Betts <[EMAIL PROTECTED]>
Subject: Re: mod_dir/1057: Web robots should be told not to index auto-generated index pages
Date: Thu, 28 Aug 1997 10:04:44 -0600 (MDT)

On Thu, 28 Aug 1997, Olly Betts wrote:

> In message <[EMAIL PROTECTED]>, [EMAIL PROTECTED] writes:
> >Synopsis: Web robots should be told not to index auto-generated index pages
> >
> >State-Changed-From-To: open-closed
> >State-Changed-By: brian
> >State-Changed-When: Wed Aug 27 10:47:33 PDT 1997
> >State-Changed-Why:
> >We talked about it on the developers list, and don't necessarily
> >agree that index pages shouldn't be indexed by robots. If
> >you want to add custom META tags to your pages, you can set
> >"IndexOptions SuppressHTMLPreamble", and then put a full HTML <HEAD>
> >section in HEADER.html in each directory.
> >
> >
>
> However, this relies on a majority of web page authors being savvy enough to
> know about the protocol, get their admin to add the IndexOptions line and to
> remember to copy HEADER.html into every directory. I think this is at best
> optimistic.
>
> Does anyone really disagree that marking auto-index pages as
> "noindex,follow" *by default* is not a good idea? This is what my
> suggestion amounts to, since it could be overridden as you describe.

Yes. It is not a good idea. Index pages can have a lot more than a
directory index in them. They can have headers, footers, file
descriptions, none of which will necessarily appear anywhere else.

This probably would be accepted as an IndexOptions setting if a patch were
made, but it probably wouldn't be enabled by default.

>
> The real problem robots have with the current situation is that (assuming
> the robot author even appreciates the problem) it is hard to come up with a
> reliable way to determine if a page is an auto-generated index page.
>
> Olly
>
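
For illustration only, here is a rough sketch of the workaround described in the quoted State-Changed-Why text. It assumes "IndexOptions SuppressHTMLPreamble" is already set on the server, so that HEADER.html has to carry its own <HEAD>. The function name write_headers, the constant HEADER_HTML, and the title text are hypothetical and do not come from the thread. The script only addresses the chore of copying HEADER.html into every directory.

```python
import os

# Hypothetical header content. With "IndexOptions SuppressHTMLPreamble" set,
# the header file is expected to supply the <HEAD> section itself.
# The title is a placeholder; the META line carries the "noindex,follow" hint.
HEADER_HTML = """<HTML>
<HEAD>
<TITLE>Directory listing</TITLE>
<META NAME="robots" CONTENT="noindex,follow">
</HEAD>
<BODY>
"""


def write_headers(root, filename="HEADER.html", overwrite=False):
    """Write HEADER_HTML into root and every directory below it.

    Returns the list of paths written. Existing header files are left
    untouched unless overwrite is True.
    """
    written = []
    for dirpath, _dirnames, _filenames in os.walk(root):
        target = os.path.join(dirpath, filename)
        if os.path.exists(target) and not overwrite:
            continue  # keep any hand-written header already in place
        with open(target, "w") as fh:
            fh.write(HEADER_HTML)
        written.append(target)
    return written
```

Calling write_headers on a document root would add the file to each directory that lacks one. Directories that already have a custom header, of the kind mentioned in the reply, keep theirs.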
