James Robertson wrote:
> At 11:38 PM 4/19/2005, you wrote:
>
>> The problem I'm having is with LiveJournal. They have 6.8 million
>> registered users, with 2.7 million active. We're talking about a
>> robots.txt of hundreds of megabytes. I imagine other blog hosting sites
>> (Blogger, Xanga) have a similar problem.
>
>
> Why would it be that big? Why wouldn't it just be a site wide policy?

We offer users a choice. There's a "bot blocking" preference to address exactly this issue on LiveJournal: some users want to be in search engines, others not. The discussion two years ago mentioned a figure of 25%, but I'm not sure how accurate that is.

When users have domains or subdomains (a paid account feature), we do produce an appropriate robots.txt file. However, only 1.5% of users have that option available to them.

I can understand if Atom doesn't want to bother with a small number of sites that have a large number of users. They are few and far between.

>> Also, robots.txt can only handle noindex. What about nofollow? Is
>> nofollow needed? Is the summary/content recommendation sufficient to
>> handle noarchive and nosnippet?
>
>
> well, IMHO, nofollow is pointless.

I'm not a fan myself.

> The rest of it? Which aggregators pay any attention?

I was under the impression that this mailing list was involved in creating a standard that could then be implemented by developers. I've already mentioned JournURL's policy (from Roger Benningfield). LiveJournal has little policy in effect, and it's something we are (or at least I am) trying to figure out. LiveJournal both produces feeds (RSS and Atom feeds are produced for all users) and consumes them (it acts as an aggregator via "syndication accounts").

I intend to contact some of the other large sites, but wanted to know if something already existed. I don't want to re-invent the wheel, and I have already done a fair amount of searching for existing solutions. I also don't want to just fabricate an element on my own when there's a mailing list for discussing such things.

-Nikolas 'Atrus' Coukouma