Nutch has quite specific requirements: lots of RAM and disk I/O, especially for the fetcher, since it produces large quantities of data. You can use Nutch on an existing cluster by just running the job file in runtime/deploy.
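
As a rough sketch of what "running the job file" can look like, assuming a Nutch 1.x build and a `hadoop` client on your PATH that already points at the existing cluster: the job file name, the `crawl/crawldb` path and the `urls` seed directory below are placeholders, not values from your setup. As far as I know, the `bin/nutch` script in runtime/deploy wraps the same kind of `hadoop jar` call.

```shell
#!/bin/sh
# Sketch only: submit a Nutch 1.x step to an existing Hadoop cluster.
# Assumed: the job file name below is hypothetical; use the one your build
# actually produced in runtime/deploy.
NUTCH_JOB="${NUTCH_JOB:-runtime/deploy/apache-nutch-1.x.job}"

# Set DRY_RUN=echo to print the command instead of submitting it.
DRY_RUN="${DRY_RUN:-}"

# inject CRAWLDB SEED_DIR
# Runs the Nutch 1.x Injector through the cluster's own hadoop client.
# Both arguments are HDFS paths in deploy mode.
inject() {
  $DRY_RUN hadoop jar "$NUTCH_JOB" org.apache.nutch.crawl.Injector "$1" "$2"
}

# Example (placeholder paths):
#   inject crawl/crawldb urls
```

With `DRY_RUN=echo` set you can check the exact command line before anything is submitted to the cluster.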
If you have limited map/reduce capacity, it will almost certainly interfere with your other jobs, so consider using a job scheduler.

> Hello,
>
> I have a machine that has hadoop + hive installed, and running as a
> single node hadoop cluster. If I were to run nutch on it, how could I
> make nutch use existing hadoop installation, instead of itself's? Or
> is that a good idea even? Maybe I should run it isolated???
>
> Best Regards,
> C.B.

