Hey Charlie,
What do the logs say in logs/hadoop.log?
You can also try setting a breakpoint in Eclipse in the map method of
InjectMapper and the reduce method of InjectReducer. When you get there in
debug mode, inspect your variables and check that everything looks good.
You can also check whether your urls make it through the filter call in
InjectMapper: url = filters.filter(url);
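
To make that check concrete, here is a minimal, self-contained sketch of
the "return the url or null" contract that a prefix-based filter follows.
It is not Nutch's PrefixURLFilter: the class name PrefixFilterSketch and
the example prefix are hypothetical, and the real filter loads its prefixes
from configuration rather than from a constructor argument. It only shows
why an empty or mismatched prefix list makes every url come back as null.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a prefix-based URL filter (not the Nutch class).
class PrefixFilterSketch {
    private final List<String> prefixes;

    PrefixFilterSketch(List<String> prefixes) {
        this.prefixes = prefixes;
    }

    // Returns the url unchanged if it starts with an accepted prefix,
    // otherwise null, meaning "drop this url".
    String filter(String url) {
        if (url == null) {
            return null;
        }
        for (String prefix : prefixes) {
            if (url.startsWith(prefix)) {
                return url;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Assumed example prefix; substitute the ones from your own setup.
        PrefixFilterSketch configured =
            new PrefixFilterSketch(Arrays.asList("http://intranet.example.com/"));
        // A filter with no prefixes loaded rejects everything.
        PrefixFilterSketch empty =
            new PrefixFilterSketch(Collections.<String>emptyList());

        String url = "http://intranet.example.com/docs";
        System.out.println("configured: " + configured.filter(url));
        System.out.println("empty:      " + empty.filter(url));
    }
}
```

If every url returns null in your standalone run, it is worth checking in
the debugger which prefixes (if any) the filter actually loaded.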
HTH,
Renaud
Charlie Williams wrote:
I have been trying to learn the Nutch code base by stepping through the
code in debug mode of Eclipse. However, I am unable to understand a piece
of code in the Injector.
When I run the crawl command used for intranet crawling, it successfully
injects urls into the database. When I run the standalone Injector on the
same set of urls, it injects nothing: each call to
PrefixURLFilter.filter( url ) returns null.
I saw in an archive that the crawl command uses crawl-tool.xml for its
config, where otherwise nutch-site.xml is used. So I made the
nutch-site.xml file exactly the same, but this seemed to have no effect.
Does anyone know why?
I apologize for the newb question, but any help would be greatly
appreciated.
-Charlie
--
Renaud Richardet +1 617 230 9112
my email is my first name at apache.org http://www.oslutions.com