Teruhiko Kurosaka wrote:
Can I use MapReduce to run Nutch on a multi CPU system?

Yes.

I want to run the index job on two (or four) CPUs
on a single system.  I'm not trying to distribute the job
over multiple systems.

If MapReduce is the way to go,
do I just specify config parameters like these:
mapred.tasktracker.tasks.maximum=2
mapred.job.tracker=localhost:9001
mapred.reduce.tasks=2 (or 1?)

and
bin/start-all.sh

?

That should work. You'd probably want to set the default number of map tasks to a multiple of the number of CPUs, and the number of reduce tasks to exactly the number of CPUs.
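
For example, on a four-CPU machine the settings might look something like the following. This is only a sketch: mapred.map.tasks is assumed to be the property that holds the default number of map tasks, 8 is an arbitrary multiple of four, and the tasktracker limit is raised to 4 on the assumption that it should match the CPU count.

mapred.job.tracker=localhost:9001
mapred.map.tasks=8
mapred.reduce.tasks=4
mapred.tasktracker.tasks.maximum=4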

Don't use start-all.sh, but rather just:

bin/nutch-daemon.sh start tasktracker
bin/nutch-daemon.sh start jobtracker

Must I use NDFS for MapReduce?

No.

Doug
