Hey Charlie,

What do the logs say in logs/hadoop.log?

You can also try setting a breakpoint in Eclipse in the map method of InjectMapper and the reduce method of InjectReducer. When you get there in debug mode, inspect your variables and check that everything looks right. You can also check whether your urls make it through this line in InjectMapper: url = filters.filter(url);
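
To make it clearer what a null from the filter means, here is a minimal sketch, assuming a prefix filter simply accepts a url when it starts with one of its configured prefixes and returns null otherwise. The class PrefixFilterSketch, the prefixes and the seed urls are all made up for illustration; this is not the actual PrefixURLFilter code.

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for a prefix-based URL filter; NOT the Nutch source.
class PrefixFilterSketch {

    // Returns the url unchanged if it starts with an accepted prefix,
    // otherwise null, which means "rejected".
    static String filter(String url, List<String> prefixes) {
        for (String prefix : prefixes) {
            if (url.startsWith(prefix)) {
                return url;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical prefixes and seed urls, for illustration only.
        List<String> prefixes = Arrays.asList("http://intranet.example.com/");
        List<String> seeds = Arrays.asList(
                "http://intranet.example.com/index.html",
                "http://www.example.org/");
        for (String seed : seeds) {
            String result = filter(seed, prefixes);
            System.out.println(seed + " -> "
                    + (result == null ? "rejected (null)" : "accepted"));
        }
    }
}
```

In this sketch an empty prefix list rejects every url, so if you see null on each pass, it may be worth checking in the debugger which prefixes the filter actually loaded.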

HTH,
Renaud


Charlie Williams wrote:
I have been trying to learn the Nutch code base by stepping through the code in debug mode of Eclipse. However, I am unable to understand a piece of code
in the Injector.

When I run the crawl command used for intranet crawling, it successfully
injects urls into the database. When I run the standalone Injector on the same
set of urls, it injects nothing, returning null from each pass of
PrefixURLFilter.filter( url )

I saw in an archive that the crawl command uses crawl-tool.xml for its
config, where otherwise nutch-site.xml is used. So I made the
nutch-site.xml file exactly the same, but this seemed to have no
effect. Does anyone know why?

I apologize for the newb question, but any help would be greatly
appreciated.

-Charlie



--
Renaud Richardet                                      +1 617 230 9112
my email is my first name at apache.org      http://www.oslutions.com
