I'm trying to crawl with Nutch 2. I checked out the source from http://svn.apache.org/repos/asf/nutch/branches/2.x/ and configured it to use MySQL.
The inject step fails with the error below, but when I run Nutch 1.5 everything works fine :(

mkdir urls
echo nutch.apache.org > urls/seed.txt
runtime/deploy/bin/nutch inject urls

12/08/07 11:25:38 INFO crawl.InjectorJob: InjectorJob: starting
12/08/07 11:25:38 INFO crawl.InjectorJob: InjectorJob: urlDir: urls
12/08/07 11:25:41 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/08/07 11:25:44 INFO input.FileInputFormat: Total input paths to process : 1
12/08/07 11:25:45 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/08/07 11:25:45 WARN snappy.LoadSnappy: Snappy native library is available
12/08/07 11:25:45 INFO snappy.LoadSnappy: Snappy native
12/08/07 11:25:47 INFO mapred.JobClient: map 0% reduce 0%
12/08/07 11:26:01 INFO mapred.JobClient: Task Id : attempt_201208071123_0001_m_000000_0, Status : FAILED
Error: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
attempt_201208071123_0001_m_000000_0: SLF4J: Class path contains multiple SLF4J bindings.
attempt_201208071123_0001_m_000000_0: SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201208071123_0001_m_000000_0: SLF4J: Found binding in [jar:file:/var/lib/hadoop-hdfs/cache/mapred/mapred/local/taskTracker/root/jobcache/job_201208071123_0001/jars/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201208071123_0001_m_000000_0: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
12/08/07 11:26:05 INFO mapred.JobClient: Task Id : attempt_201208071123_0001_m_000000_1, Status : FAILED
Error: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
attempt_201208071123_0001_m_000000_1: SLF4J: Class path contains multiple SLF4J bindings.
attempt_201208071123_0001_m_000000_1: SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201208071123_0001_m_000000_1: SLF4J: Found binding in [jar:file:/var/lib/hadoop-hdfs/cache/mapred/mapred/local/taskTracker/root/jobcache/job_201208071123_0001/jars/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201208071123_0001_m_000000_1: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
12/08/07 11:26:10 INFO mapred.JobClient: Task Id : attempt_201208071123_0001_m_000000_2, Status : FAILED
Error: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
attempt_201208071123_0001_m_000000_2: SLF4J: Class path contains multiple SLF4J bindings.
attempt_201208071123_0001_m_000000_2: SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201208071123_0001_m_000000_2: SLF4J: Found binding in [jar:file:/var/lib/hadoop-hdfs/cache/mapred/mapred/local/taskTracker/root/jobcache/job_201208071123_0001/jars/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201208071123_0001_m_000000_2: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
12/08/07 11:26:19 INFO mapred.JobClient: Job complete: job_201208071123_0001
12/08/07 11:26:19 INFO mapred.JobClient: Counters: 7
12/08/07 11:26:19 INFO mapred.JobClient: Job Counters
12/08/07 11:26:19 INFO mapred.JobClient: Failed map tasks=1
12/08/07 11:26:19 INFO mapred.JobClient: Launched map tasks=4
12/08/07 11:26:19 INFO mapred.JobClient: Data-local map tasks=4
12/08/07 11:26:19 INFO mapred.JobClient: Total time spent by all maps in occupied slots (ms)=18003
12/08/07 11:26:19 INFO mapred.JobClient: Total time spent by all reduces in occupied slots (ms)=0
12/08/07 11:26:19 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
12/08/07 11:26:19 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
12/08/07 11:26:19 ERROR crawl.InjectorJob: InjectorJob: java.lang.RuntimeException: job failed: name=inject-p1 urls, jobid=job_201208071123_0001
    at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:47)
    at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:248)
    at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:268)
    at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:288)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:298)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

Thanks,
Trần Anh Tuấn
Phone: 0989896118
Yahoo: tk1cntt
Skype: tk1cntt

