Hi Ali

See http://wiki.apache.org/nutch/NutchHadoopSingleNodeTutorial for the
Hadoop cluster part.
Regarding the Crawl class: there is now a script for crawling, see
bin/crawl.sh. It is easier to modify than the all-in-one Crawl class and
gives a good understanding of the underlying processing steps.
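
As a rough illustration of what "underlying processing steps" means here, the
sketch below builds (but does not run) the bin/nutch commands for the seed
injection and for one generate/fetch/parse/updatedb round. It is not the
content of the crawl script. The paths (crawl/crawldb, crawl/segments, urls),
the segment name and the helper functions are hypothetical, and it assumes
bin/nutch is invoked from a deploy-mode runtime so that the jobs are submitted
to Hadoop. It also leaves out later steps such as link inversion and indexing.

```python
from typing import List

# Assumption: run from the deploy-mode runtime directory, where bin/nutch
# submits each step to the Hadoop cluster as a job.
NUTCH = "bin/nutch"


def inject_cmd(crawldb: str, seed_dir: str) -> List[str]:
    """Command that seeds the crawldb with the URLs found in seed_dir."""
    return [NUTCH, "inject", crawldb, seed_dir]


def round_cmds(crawldb: str, segments_dir: str, segment: str) -> List[List[str]]:
    """Commands for one crawl round, in the order they would be run.

    `segment` stands for the segment directory that the generate step
    creates under segments_dir; here it is passed in as a placeholder
    rather than discovered after the generate step.
    """
    return [
        [NUTCH, "generate", crawldb, segments_dir],  # select URLs to fetch
        [NUTCH, "fetch", segment],                   # download the pages
        [NUTCH, "parse", segment],                   # extract text and outlinks
        [NUTCH, "updatedb", crawldb, segment],       # merge results into crawldb
    ]


if __name__ == "__main__":
    # Hypothetical locations, for illustration only.
    crawldb = "crawl/crawldb"
    segments_dir = "crawl/segments"
    segment = segments_dir + "/SEGMENT_NAME"  # placeholder, not a real segment

    print(" ".join(inject_cmd(crawldb, "urls")))
    for cmd in round_cmds(crawldb, segments_dir, segment):
        print(" ".join(cmd))
```

Repeating the round several times gives a deeper crawl. Keeping each step as
a separate command is what makes the script approach easier to adapt than a
single all-in-one class.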

Best

Julien


On 19 May 2014 11:55, Ali Nazemian <[email protected]> wrote:

> Hi,
> I was wondering how I can run a Nutch 1.8 job on a Hadoop cluster. As far
> as I know, for running a Nutch 1.7 job on a Hadoop cluster we could use the
> org.apache.nutch.crawl.Crawl class. Since this class was deprecated and
> removed from Nutch 1.8, which class is responsible for the crawling job?
> Which class should I put in the hadoop command?
> Best regards.
> --
> A.Nazemian
>



-- 

Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
