Hi,
is there a known problem with Hadoop 0.3.1 and Nutch class loading or
job file usage?

I wrote a custom tool and want to start it via:
bin/nutch myclass  crawldb 1000

But I only find the following exception in the task reporter messages:

java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.crawl.CrawlDatum
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:263)
    at org.apache.hadoop.mapred.JobConf.getOutputValueClass(JobConf.java:351)
    at org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:314)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:77)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:847)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.crawl.CrawlDatum
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:247)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:258)
    ... 4 more
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.crawl.CrawlDatum
    at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:268)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:245)
    ... 5 more

It looks as if the nutch-XXX.job file is not used when a class is
started with bin/nutch.
How can I force the Nutch job file to be used?
For now I worked around the problem by copying the Nutch jar file to lib
and restarting the complete system to trigger the rsync process.
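
In case it helps to narrow this down, here is a minimal sketch of a check
that could be run with the same classpath as the failing task. It only
uses the JDK; the class name ClassOrigin and the method locate are
hypothetical names of mine, not part of Nutch or Hadoop. It reports
whether a class is visible to a given class loader and, if so, which jar
or directory it was loaded from.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper (not part of Nutch or Hadoop).
class ClassOrigin {

    // Returns the location a class was loaded from, "bootstrap/unknown" if the
    // class is visible but has no code source, or null if it cannot be found.
    static String locate(String className, ClassLoader loader) {
        try {
            // initialize=false: only resolve the class, do not run static initializers
            Class<?> c = Class.forName(className, false, loader);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "bootstrap/unknown";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            // Same failure as in the stack trace: the loader cannot see the class
            return null;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "org.apache.nutch.crawl.CrawlDatum";
        String origin = locate(name, ClassLoader.getSystemClassLoader());
        System.out.println(name + " -> " + (origin == null ? "NOT FOUND" : origin));
    }
}
```

If this prints NOT FOUND for org.apache.nutch.crawl.CrawlDatum, the
system class loader does not see the Nutch classes, which would match
the AppClassLoader frames in the trace above.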


Thanks for any hints.
Stefan





_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
