This robots.txt issue was exaggerated by leftist critics of the administration. (This is not a general defense of the White House, just a statement of fact.) The Bush WH.gov server has a special Iraq section where press releases, speeches, etc. are reposted in a different HTML template. The WH only wants the "master" copy indexed, not the duplicate copy in the second template. Hence the apparent weirdness in robots.txt: the Disallow lines cover the duplicate copies.
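For anyone who wants to check this themselves, here is a rough sketch of how a well-behaved crawler would read rules of that kind, using Python's standard urllib.robotparser. The rules and paths below are made up for illustration (they are not the actual contents of the whitehouse.gov file), and is_indexable is just a helper name chosen here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT the real whitehouse.gov file.
# They exclude two duplicate trees and leave the "master" tree crawlable.
RULES = """\
User-agent: *
Disallow: /news/iraq
Disallow: /news/text
"""


def is_indexable(path, rules=RULES, agent="*"):
    """Return True if a compliant crawler may fetch `path` under `rules`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/news/speech.html",
                 "/news/iraq/speech.html",
                 "/news/text/speech.html"):
        status = "indexable" if is_indexable(path) else "excluded"
        print(path, status)
```

Under these assumed rules the master copy stays indexable and both duplicate copies are excluded, which is the pattern described above.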
I have not found any skullduggery going on, though I suppose it wouldn't hurt to keep a copy of the Iraq section for "diff" purposes just in case.

-Declan

On Wed, Dec 10, 2003 at 02:59:07PM +0200, Anatoly Vorobey wrote:
> On Wed, Dec 10, 2003 at 12:56:24PM +0100, Eugen Leitl wrote:
> > Can somebody with a webspider crawl these documents, and put it up
> > on the web?
> >
> > http://www.whitehouse.gov/robots.txt
>
> All or nearly all of them are duplicates of same documents
> elsewhere in the directory tree; "X/text/" and "X/iraq/" are
> supposed to be copies of "X/", with images removed in the first
> case. I suspect that downloading them all would just confirm that.
>
> --
> avva