The problem I'm having is with LiveJournal. They have 6.8 million registered users, with 2.7 million active. If every journal that opts out needs its own entry, we're talking about a robots.txt of hundreds of megabytes. I imagine other blog hosting sites (Blogger, Xanga) have a similar problem.
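
As a rough back-of-the-envelope check on that size, here is a minimal sketch. It assumes the worst case, where every journal gets its own Disallow line. The `/users/<username>/` path layout and the 10-character average username are assumptions made for illustration, not figures from LiveJournal.

```python
# Rough size estimate for a robots.txt with one Disallow line per journal.
# Assumptions (not LiveJournal data): every journal opts out, journals live
# under /users/<username>/, and usernames average 10 characters.

def robots_txt_size_bytes(num_users: int, avg_username_len: int = 10) -> int:
    """Return the approximate size in bytes of the per-user Disallow lines."""
    # Hypothetical line format: "Disallow: /users/<username>/\n"
    fixed_overhead = len("Disallow: /users/") + len("/\n")
    return num_users * (fixed_overhead + avg_username_len)


if __name__ == "__main__":
    for label, count in (("registered", 6_800_000), ("active", 2_700_000)):
        size_mb = robots_txt_size_bytes(count) / 1e6
        print(f"{label}: about {size_mb:.0f} MB")
```

Under these assumptions the file lands in the high tens to low hundreds of megabytes, and a crawler would have to fetch all of it before requesting a single page. Longer usernames or extra paths per journal only push it higher.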
Also, robots.txt can only handle noindex. What about nofollow? Is nofollow needed? Is the summary/content recommendation sufficient to handle noarchive and nosnippet?

Regards,
-Nikolas 'Atrus' Coukouma

James Robertson wrote:

> robots.txt solved this problem a long time ago
>
> At 11:15 PM 4/19/2005, you wrote:
>
>> Hi,
>> I've recently ended up in an argument about what to do with feeds that
>> don't want to be reproduced. I e-mailed Dave Winer in the hope of
>> getting some information about the RSS end of things. That resulted in a
>> blog entry with interesting comments [1], and I now know that Creative
>> Commons has an RDF schema for describing licensing [2].
>>
>> The only common feature I want to include, and haven't found, is the
>> "noindex" type of behavior (do not include in search engines). I
>> searched the archives of this list and found an old thread discussing
>> this very issue [3]. It seems to have fizzled out and I haven't found
>> any more recent documents or discussions.
>>
>> Was the issue simply forgotten or purposefully dropped?
>>
>> In the RSS discussion, it was suggested by Roger Benningfield that
>> search engines and syndication sites use atom:summary instead of
>> atom:content to avoid the noarchive issue. The rationale is that
>> summaries are meant to be reproduced, much like an abstract for a paper.
>>
>> I'm not sure about nofollow, but I think noindex is definitely needed. The
>> latter could be used to opt out of services such as Feedster,
>> Technorati, and PubSub.
>>
>> Thoughts and comments?
>>
>> [1] http://www.reallysimplesyndication.com/2005/04/19#a445
>> [2] http://web.resource.org/cc/
>> [3] http://www.imc.org/atom-syntax/mail-archive/msg00183.html
>>
>> Regards,
>> -Nikolas 'Atrus' Coukouma
>
>
> <Talk Small and Carry a Big Class Library>
> James Robertson, Product Manager, Cincom Smalltalk
> http://www.cincomsmalltalk.com/blog/blogView
>