On 2010-10-28 12:30, Claudio Martella wrote:
> Hello list,
> 
> I have a Hadoop cluster where I'd like to run Nutch for crawling my
> repositories. I'm currently running Cloudera's Hadoop
> 0.20.2+737-1~lenny-cdh3b3 and Nutch 1.1 or Nutch 1.2. When I try to run:
> 
> $ hadoop jar build/nutch-1.2.job org.apache.nutch.crawl.Crawl
> /crawls/urls/ -depth 15 -dir /crawls/ -solr http://searchserver:8080/solr/

> about the x point ... URLNormalizer not found. I read in some old
> archives that it could be due to jar format discrepancies between
> Nutch's .job and what the Hadoop cluster is expecting. Do you have any idea?

This error indicates that plugin.folders or plugin.includes is set to
an incorrect value, so the Nutch plugins can't be found. Could you
please check the job.xml file that the JobTracker creates for this job
(accessible via the Hadoop web UI) and see what the values of these
properties are?
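
If it helps, the two properties can also be pulled out of a saved copy
of job.xml with a few lines of Python. This is only a sketch: it assumes
job.xml was downloaded from the web UI and follows the usual Hadoop
configuration layout (<property> elements with <name> and <value>
children). The script name check_plugins.py and the function name
read_properties are made up for illustration.

```python
import sys
import xml.etree.ElementTree as ET

# The two properties to inspect in job.xml.
PLUGIN_KEYS = ("plugin.folders", "plugin.includes")


def read_properties(xml_data, keys=PLUGIN_KEYS):
    """Return {name: value} for the requested keys.

    Assumes Hadoop-style configuration XML:
    <configuration><property><name>..</name><value>..</value></property>...
    """
    root = ET.fromstring(xml_data)
    found = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name in keys:
            found[name] = prop.findtext("value")
    return found


# Hypothetical usage: python check_plugins.py job.xml
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], "rb") as fh:
        props = read_properties(fh.read())
    for key in PLUGIN_KEYS:
        print(f"{key} = {props.get(key, '<not set>')}")
```

A property reported as "<not set>", or a plugin.folders value that does
not match where the plugins actually live inside the .job file, would
point at the problem described above.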

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
