Is there somewhere I can see why the crawler has rejected a page? I'm trying to get 1.1 working, and it doesn't seem to crawl the same way that 1.0 does.
I have tika included in the parse- section of conf/nutch-site.xml, and I have DEBUG set for all the crawl sections, but the log doesn't really say why it's rejecting a site. I have the crawler set to not follow external links, and I seed the top level of each site. I'm just unclear on how to proceed with troubleshooting this.