Nutch has quite specific requirements: lots of RAM and disk I/O, especially
for the fetcher, since it produces large quantities of data. You can use Nutch
on an existing cluster by just running the job file in runtime/deploy.
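
As a rough illustration of what running the job file can look like, here is a
minimal Python sketch that builds a `hadoop jar` command line for the .job file
found under runtime/deploy. The helper name build_nutch_command, the Nutch
install path and the crawldb/urls paths are assumptions made up for this
example. The Injector class is just one of the Nutch job classes, and the
sketch assumes the `hadoop` command of the existing installation is on the
PATH.

```python
import glob
import os


def build_nutch_command(nutch_home, main_class, args):
    """Build the argv list that submits the Nutch job file to Hadoop.

    nutch_home is the (assumed) root of the Nutch checkout; the job file is
    looked up under runtime/deploy, as described above.
    """
    pattern = os.path.join(nutch_home, "runtime", "deploy", "*.job")
    job_files = sorted(glob.glob(pattern))
    if not job_files:
        raise FileNotFoundError("no .job file found matching " + pattern)
    # "hadoop jar <file> <main class> <args...>" runs the job on whatever
    # cluster the local Hadoop configuration points at.
    return ["hadoop", "jar", job_files[0], main_class] + list(args)


# Hypothetical usage (paths are placeholders, not taken from the thread):
#   import subprocess
#   cmd = build_nutch_command("/opt/nutch", "org.apache.nutch.crawl.Injector",
#                             ["crawl/crawldb", "urls"])
#   subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell
quoting problems if any path contains spaces.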

If you have limited map/reduce capacity, it will almost certainly interfere
with your other jobs, so consider using a job scheduler.

> Hello,
> 
> I have a machine that has hadoop + hive installed, and running as a
> single node hadoop cluster. If I were to run nutch on it, how could I
> make nutch use existing hadoop installation, instead of itself's? Or
> is that a good idea even? Maybe I should run it isolated???
> 
> Best Regards,
> C.B.
