Well, you could set a fake user agent.
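In Nutch the user agent string sent with each fetch is assembled from the `http.agent.*` properties, which you can override in `conf/nutch-site.xml`. A minimal sketch (the browser-like value shown is just an example, not a recommendation; check the usage policy of the sites you crawl before spoofing):

```xml
<!-- conf/nutch-site.xml: override the default crawler identification.
     Property names are Nutch's standard http.agent.* keys; the values
     here are illustrative only. -->
<configuration>
  <property>
    <name>http.agent.name</name>
    <!-- Some sites reject anything that doesn't look like a browser. -->
    <value>Mozilla/5.0 (compatible; MyCrawler)</value>
  </property>
  <property>
    <name>http.agent.description</name>
    <value>MyCrawler test crawl</value>
  </property>
  <property>
    <name>http.agent.url</name>
    <value>http://example.com/crawler</value>
  </property>
  <property>
    <name>http.agent.email</name>
    <value>crawler@example.com</value>
  </property>
</configuration>
```

Note this only helps with sites that filter on the User-Agent header; pages that genuinely require JavaScript or cookies will still serve the error page.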
> As I crawl more websites I'm finding I'm encountering more and more websites
> that reject the crawl by basically redirecting the crawl to an HTML page
> that states something along the lines of:
>
> HTTP 602 Unsupported Browser The browser you are using (XYZ Spider/0.1 beta
> (xyz.com search engine; http://www.xyz.com))
>
> or
>
> Sorry, but you either have JavaScript turned off or a JavaScript
> incompatible browser
>
> Or
>
> Unsupported Browser
> Browser type and version Generic crawler 0.1
> Browser build Platform Unknown
> Cookies supported False
> Cookies enabled Disabled
> JavaScript supported False
> JavaScript enabled False
> ActiveX enabled False
> VBScript enabled False
> Java applets supported False
> Etc...
>
> Lots of different messages come back, but basically it is rejecting a crawl
> of the website because of browser incompatibility.
>
> Do I have Nutch configured incorrectly?
> Is there a way to crawl these sites?
> Recommendations?
>
> Thanks
> Brad

