Hi,

When I run the Nutch 2.0 crawler with the following command:
hadoop jar /opt/nutch-2.0/runtime/deploy/apache-nutch-2.0.job \
    org.apache.nutch.crawl.Crawler urls -dir output00 -depth 3 -topN 5 -threads 80

the map task fails with this error:

12/07/18 09:13:32 INFO mapred.JobClient: Task Id : attempt_201207101015_0091_m_000000_2, Status : FAILED
java.lang.RuntimeException: x point org.apache.nutch.net.URLNormalizer not found.
  
But the URL regexes in conf/regex-urlfilter.txt look correct to me:

+^http://([a-z0-9]*\.)*apache.org
+^http://([a-z0-9]*\.)*sina.com.cn
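
To double-check the two patterns outside Nutch, here is a minimal sketch in Java. The class name RegexCheck, the helper accepted(), and the sample URLs are my own assumptions for illustration. It also assumes the filter applies each pattern with find() semantics. It is not Nutch code.

```java
import java.util.regex.Pattern;

public class RegexCheck {
    // Patterns copied from conf/regex-urlfilter.txt, without the leading '+'.
    static final Pattern[] ACCEPT = {
        Pattern.compile("^http://([a-z0-9]*\\.)*apache.org"),
        Pattern.compile("^http://([a-z0-9]*\\.)*sina.com.cn")
    };

    // Returns true if any accept pattern is found in the URL
    // (assumption: find() semantics, anchored only by the leading '^').
    static boolean accepted(String url) {
        for (Pattern p : ACCEPT) {
            if (p.matcher(url).find()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Sample URLs are hypothetical test inputs, not taken from my seed list.
        System.out.println(accepted("http://nutch.apache.org/"));
        System.out.println(accepted("http://news.sina.com.cn/"));
        System.out.println(accepted("http://example.com/"));
    }
}
```

Under those assumptions the first two URLs are accepted and the third is rejected, so the patterns themselves seem to behave as intended.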

So, what should I do?

Thanks.

Ring


