I have been trying to learn the Nutch code base by stepping through the code
in debug mode of Eclipse. However, I am unable to understand a piece of code
in the Injector.

When I run the crawl command used for intranet crawling, it successfully
injects URLs into the database. When I run the standalone Injector on the
same set of URLs, it injects nothing: every call to
PrefixURLFilter.filter( url ) returns null.

I saw in an archive that the crawl command uses crawl-tool.xml for its
config, whereas nutch-site.xml is used otherwise. So I made the
nutch-site.xml file exactly the same, but this seemed to have no
effect. Does anyone know why?

I apologize for the newb question, but any help would be greatly
appreciated.

-Charlie
