Just to add to what Markus said, see
http://wiki.apache.org/nutch/NutchHadoopSingleNodeTutorial

The approach is the same for 2.x. Nutch is just a Hadoop application with a few scripts to make your life easier.
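
As a rough illustration of running a Nutch step from a Hadoop client, something like the sketch below should be close. The names here are assumptions for illustration and do not come from this thread: `runtime/deploy` as the directory holding the Nutch job file, `urls` as the seed directory, and the `DRY_RUN` switch and `run` helper, which are purely hypothetical conveniences. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: paths and the DRY_RUN helper are assumptions, not part of Nutch.
NUTCH_DEPLOY="${NUTCH_DEPLOY:-runtime/deploy}"  # assumed location of bin/nutch and the .job file
SEED_DIR="${SEED_DIR:-urls}"                    # assumed local directory with seed URL files
DRY_RUN="${DRY_RUN:-1}"                         # 1 = print commands, 0 = execute them

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Copy the seed list into HDFS so the cluster can read it.
run hadoop fs -put "$SEED_DIR" "$SEED_DIR"

# Run the inject step through bin/nutch from the deploy directory,
# on a machine where the hadoop client is on the PATH.
run "$NUTCH_DEPLOY/bin/nutch" inject "$SEED_DIR"
```

Set `DRY_RUN=0` on a machine with a configured Hadoop client to actually submit the job. The remaining crawl steps are run through `bin/nutch` in the same way, and their exact arguments depend on the Nutch version.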

Julien

On 13 November 2013 09:45, Markus Jelsma <[email protected]> wrote:

> You can just install Hadoop on the cluster as you would have otherwise.
> Then you can run the Nutch job file via the bin/nutch script on any Hadoop
> client such as the jobtracker for example.
>
> -----Original message-----
> > From:flo @ <[email protected]>
> > Sent: Wednesday 13th November 2013 10:20
> > To: [email protected]
> > Subject: Nutch cluster
> >
> > Which is the best approach to setup a nutch cluster with multiple nutch
> > instances running on different machines. Is there some kind of scheduler
> > for nutch?
> >
> > I already configured a single nutch instance with HBase for storing the
> > index in the background.
> >
> > Thanks
> >
> > flo
> >

--
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble

