Hi, I'm relatively new to Nutch and am experimenting with it a little. I'm making heavy use of URL filters to restrict my crawl to a particular set of sites, but I'd like to do some analysis of the sites that these sites link to which are NOT in my set. Is there a way to get Nutch to tell me which URLs have been excluded by the URL filters?
Thanks for your help,
Doug Cook
