Tom

> [quote]
> Finally, a question that exposes the worst flaw of the robots.txt protocol:
> a webmaster wishes to make all pages of a Web site, EXCEPT the home page
> (i.e. "/"), accessible to robots; how can she do this using the robots.txt
> protocol? The answer - "She can't".
> [unquote]
>
> This is nonsense, what's happening here is that the Webmaster doesn't
> understand websites and has failed to distinguish between the default page
> and the home page. If she wants to allow all pages except the homepage, she
> can write a disallow line that explicitly excludes the home page using its
> full path. She can then have the default page be a client-side redirection
> to the home page, so that users other than robots will get the impression
> that the home page is the default page.
The '(i.e. "/")' in the question was supposed to define the term "home page"
more precisely - it's what you call the default page. So, to put the question
in your terms:

A webmaster wishes to make all pages of a Web site, EXCEPT the default page
(i.e. "/"), accessible to robots; how can she do this using the robots.txt
protocol?

With regard to your proposed solution, is the onus on Webmasters to
understand robots, or on robots to understand Webmasters? In the light of
recent legal cases, I think it may be the latter...

There are problems with robots.txt and the robots meta tag. We all know it.
I think there *could* be one standard that addressed the following issues:
link rot, crawling, caching/duplication, indexing and the law. That standard
would be a worthy replacement for the current standards.

Regards
Alan Perkins, e-Brand Management Limited
http://www.ebrandmanagement.com/

White Paper in question: http://www.ebrandmanagement.com/whitepapers/
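To make the question above concrete, here is a minimal sketch using Python's standard urllib.robotparser, which applies the plain prefix matching of the original robots.txt protocol. The host www.example.com, the path /home.html, the agent name ExampleBot and the helper allowed() are all made up for illustration. The first rule set tries to exclude only "/", and the second is the workaround suggested in the quoted message.

```python
from urllib.robotparser import RobotFileParser

SITE = "http://www.example.com"  # hypothetical host, for illustration only


def allowed(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Attempt 1: exclude the default page "/". Because Disallow values are
# matched as path prefixes, this excludes every URL on the site.
BLOCK_ROOT = "User-agent: *\nDisallow: /\n"

# Attempt 2 (the quoted workaround): exclude the home page by its full
# path, here assumed to be /home.html. The default page "/" stays open.
BLOCK_HOME_FILE = "User-agent: *\nDisallow: /home.html\n"

if __name__ == "__main__":
    for name, rules in (("Disallow: /", BLOCK_ROOT),
                        ("Disallow: /home.html", BLOCK_HOME_FILE)):
        for path in ("/", "/home.html", "/products/list.html"):
            print(name, path, allowed(rules, SITE + path))
```

Under prefix matching, neither rule set excludes "/" alone: the first shuts robots out of the whole site, and the second still leaves "/" itself open to them.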