I have been trying to learn the Nutch code base by stepping through the code in Eclipse's debug mode. However, I am unable to understand a piece of code in the Injector.
When I run the crawl command used for intranet crawling, it successfully injects URLs into the database. When I run the standalone Injector on the same set of URLs, it injects nothing, and every call to PrefixURLFilter.filter( url ) returns null. I saw in an archive that the crawl command uses crawl-tool.xml for its config, whereas nutch-site.xml is used otherwise. So I made the nutch-site.xml file exactly the same, but this seemed to have no effect. Does anyone know why? I apologize for the newbie question, but any help would be greatly appreciated.
-Charlie
