Stefan Groschupf wrote:

I was trying the Grep job, however it fails since nutch.jar was not found.

Did you run Grep.main using bin/nutch? That should do the trick:

bin/nutch org.apache.nutch.mapReduce.demo.Grep <in> <out> <re> [<group>]

<in> and <out> are directories of files. <re> is the regular expression. <group> is the optional capture group within the regex to select when mapping.
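To illustrate what the group argument selects, here is a minimal standalone sketch using java.util.regex. It is not the Grep job's actual code: the class name GrepGroupSketch, the select helper, and the sample line are hypothetical, and treating group 0 (the whole match) as the value used when no group is given is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Nutch.
class GrepGroupSketch {

    // Collects the text of the chosen group for every match of regex in line.
    // Group 0 is the whole match; group 1 is the first parenthesized group.
    static List<String> select(String line, String regex, int group) {
        Matcher m = Pattern.compile(regex).matcher(line);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group(group));
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "fetching http://a.example/ and http://b.example/";
        String regex = "http://([a-z.]+)/";
        System.out.println(select(line, regex, 0)); // whole matches
        System.out.println(select(line, regex, 1)); // host names only
    }
}
```

With group 1 only the host names are kept, which is roughly the kind of selection the optional last argument is described as doing during the map step.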

Note that you should also set "mapred.job.tracker" in nutch-site.xml to the host:port of your job tracker. (If you're using NDFS, you should probably also define "fs.default.name".)
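As a rough sketch, the two settings might look like this in nutch-site.xml. The host names and ports are placeholders, and the <nutch-conf> root element and property layout are assumed to match your nutch-default.xml, so check that file before copying:

```xml
<?xml version="1.0"?>
<!-- Sketch only: hosts and ports below are placeholders, not real defaults. -->
<nutch-conf>
  <property>
    <name>mapred.job.tracker</name>
    <value>jobtracker.example.com:9001</value>
  </property>
  <!-- Only needed when using NDFS. -->
  <property>
    <name>fs.default.name</name>
    <value>namenode.example.com:9000</value>
  </property>
</nutch-conf>
```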

Doug
