As I crawl more websites, I'm finding that more and more of them reject the crawl by redirecting it to an HTML page that states something along the lines of:
  HTTP 602 Unsupported Browser
  The browser you are using (XYZ Spider/0.1 beta (xyz.com search
  engine; http://www.xyz.com))

or:

  Sorry, but you either have JavaScript turned off or a JavaScript
  incompatible browser

or:

  Unsupported Browser
    Browser type and version: Generic crawler 0.1
    Browser build:
    Platform: Unknown
    Cookies supported: False
    Cookies enabled: Disabled
    JavaScript supported: False
    JavaScript enabled: False
    ActiveX enabled: False
    VBScript enabled: False
    Java applets supported: False
    Etc...

Lots of different messages come back, but basically the site is rejecting the crawl because of browser incompatibility. Do I have Nutch configured incorrectly? Is there a way to crawl these sites? Recommendations?

Thanks,
Brad
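For context, one common cause of these rejections is an unidentifiable User-Agent string: sites that sniff browser capabilities often also reject UA patterns they don't recognize. Nutch builds its User-Agent header from the http.agent.* properties, which you can override in conf/nutch-site.xml. A minimal sketch (the property names are Nutch's; the values here are placeholders you would replace with your own):

```xml
<!-- conf/nutch-site.xml: identify the crawler explicitly.
     Some sites reject requests whose User-Agent they do not recognize. -->
<property>
  <name>http.agent.name</name>
  <value>MyCrawler</value>
  <description>Agent name sent in the HTTP User-Agent header.</description>
</property>
<property>
  <name>http.agent.description</name>
  <value>experimental research crawler</value>
</property>
<property>
  <name>http.agent.url</name>
  <value>http://example.com/crawler-info</value>
</property>
<property>
  <name>http.agent.email</name>
  <value>crawler-admin@example.com</value>
</property>
```

Note that this only helps with UA-based filtering. Nutch's standard HTTP fetcher does not execute JavaScript or accept cookies like a browser, so sites that gate content on actual JavaScript or cookie support will still serve the fallback page regardless of the agent string.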

