https://wiki.apache.org/nutch/NutchHadoopTutorial
Basically, follow the steps in http://hadoop.apache.org/docs/stable/cluster_setup.html, then install Nutch on the master node of your cluster, 'cd runtime/deploy/bin' and use the nutch scripts as usual. You can then use the standard MapReduce webapp to monitor the progress of your crawl.

Julien

On 21 February 2013 10:00, Amit Sela <[email protected]> wrote:

> Anyone have a good tutorial about deploying nutch (1.6) on a pre-existing
> Hadoop cluster ?
>
> Thanks.

-- 
Open Source Solutions for Text Engineering
http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
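For what it's worth, the steps in the reply above could look roughly like the sketch below. It is not from the original message, and it makes several assumptions: the install path in NUTCH_HOME, the seed directory name "urls" and the crawl directory name "crawl" are all hypothetical, Nutch 1.6 is assumed to have been built from source so that runtime/deploy exists, and the hadoop command is assumed to be on the PATH of the master node. The run helper and the DRY_RUN switch are illustration only; by default the script just prints the commands it would execute.

```shell
#!/bin/sh
# Sketch only. NUTCH_HOME, SEED_DIR and CRAWL_DIR are assumed names,
# not values taken from the original message.
NUTCH_HOME="${NUTCH_HOME:-$HOME/apache-nutch-1.6}"
SEED_DIR="${SEED_DIR:-urls}"
CRAWL_DIR="${CRAWL_DIR:-crawl}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, anything else = execute them

# Hypothetical helper: echo a command, and run it unless DRY_RUN is 1.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

# Copy the local seed list into HDFS so the MapReduce jobs can read it.
run hadoop fs -put "$SEED_DIR" "$SEED_DIR"

# Work from the deploy runtime, which submits jobs to the cluster
# instead of running them locally.
run cd "$NUTCH_HOME/runtime/deploy/bin"

# Use the nutch script as usual, here the first two steps of a crawl cycle.
run ./nutch inject "$CRAWL_DIR/crawldb" "$SEED_DIR"
run ./nutch generate "$CRAWL_DIR/crawldb" "$CRAWL_DIR/segments"
```

Setting DRY_RUN=0 would actually run the commands; the remaining steps of the crawl cycle (fetch, parse, updatedb) would follow the same pattern, and each one should show up as a job in the MapReduce web UI.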

