Re: Nutch 1.8 on hadoop

Julien Nioche Mon, 19 May 2014 04:15:31 -0700

Hi Ali

See http://wiki.apache.org/nutch/NutchHadoopSingleNodeTutorial for the
Hadoop cluster part.
Regarding the Crawl class: there is now a script for crawling, see
bin/crawl.sh. It is easier to modify than the all-in-one Crawl class and
gives a good understanding of the underlying processing steps.
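
To make the "underlying processing steps" concrete, here is a rough sketch
(not the script itself) of the hadoop commands that one crawl round boils
down to, with one Nutch class per step. The job file name, the paths and the
segment name are assumptions for illustration; in a real run the segment
directory is created by the generate step and has to be looked up afterwards.
The helper names inject_command and crawl_round_commands are hypothetical.

```python
# Sketch only: builds the per-step commands as lists, it does not run them.
# ASSUMPTION: the job file is named apache-nutch-1.8.job; adjust to your build.
NUTCH_JOB = "apache-nutch-1.8.job"


def _hadoop(job, cls, *args):
    """Build a 'hadoop jar <job> <class> <args...>' command as a list."""
    return ["hadoop", "jar", job, cls] + [str(a) for a in args]


def inject_command(crawldb, seed_dir, job=NUTCH_JOB):
    """One-off step: seed the crawldb with the URLs found in seed_dir."""
    return _hadoop(job, "org.apache.nutch.crawl.Injector", crawldb, seed_dir)


def crawl_round_commands(crawldb, segments_dir, segment, linkdb,
                         top_n=1000, job=NUTCH_JOB):
    """Commands for one generate / fetch / parse / updatedb / invertlinks round.

    ASSUMPTION: 'segment' is the directory that the generate step created
    under segments_dir; it is passed in here rather than discovered.
    """
    return [
        _hadoop(job, "org.apache.nutch.crawl.Generator",
                crawldb, segments_dir, "-topN", top_n),
        _hadoop(job, "org.apache.nutch.fetcher.Fetcher", segment),
        _hadoop(job, "org.apache.nutch.parse.ParseSegment", segment),
        _hadoop(job, "org.apache.nutch.crawl.CrawlDb", crawldb, segment),
        _hadoop(job, "org.apache.nutch.crawl.LinkDb", linkdb, segment),
    ]


if __name__ == "__main__":
    # Dry run with made-up paths: print the commands instead of executing them.
    print(" ".join(inject_command("crawl/crawldb", "urls")))
    for cmd in crawl_round_commands("crawl/crawldb", "crawl/segments",
                                    "crawl/segments/SEGMENT_NAME",
                                    "crawl/linkdb", top_n=50):
        print(" ".join(cmd))
```

Printing the commands first is deliberate: it lets you check the class names
and paths against your own setup before anything is submitted to the cluster.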


Best

Julien


On 19 May 2014 11:55, Ali Nazemian <[email protected]> wrote:

> Hi,
> I was wondering how I can run a Nutch 1.8 job on a Hadoop cluster. As far
> as I know, to run a Nutch 1.7 job on a Hadoop cluster we could use the
> org.apache.nutch.crawl.Crawl class. Since this class was deprecated and
> removed in Nutch 1.8, which class is responsible for the crawling job?
> Which class should I put in the hadoop command?
> Best regards.
> --
> A.Nazemian
>



-- 

Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
