This is definitely a Hadoop problem. It is similar to the classpath issues that we were encountering before with Hadoop and the ReduceTaskRunner. When I include the nutch-*.jar in the Hadoop classpath, the errors go away. That is not a fix, but it shows that this is an issue with Hadoop class loading.
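
For anyone who wants to reproduce the check outside of a running job, here is a minimal sketch. It assumes only what the trace in this thread shows: the lookup goes through Class.forName against the application class loader. The class name ClassVisibilityCheck and the optional jar-path argument are made up for illustration and are not part of Nutch or Hadoop.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Paths;

// Hypothetical helper, not part of Nutch or Hadoop.
public class ClassVisibilityCheck {

    /** Returns true if the given loader can resolve the class (without initializing it). */
    static boolean isVisible(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0]: class to look up; defaults to the class named in the trace.
        String className = args.length > 0
                ? args[0]
                : "org.apache.nutch.metadata.MetaWrapper";

        // The system (application) class loader is the one the trace went through.
        ClassLoader appLoader = ClassLoader.getSystemClassLoader();
        System.out.println("system class loader sees " + className + ": "
                + isVisible(className, appLoader));

        // args[1] (optional, assumed): path to a jar to add on top of the classpath.
        if (args.length > 1) {
            URL jar = Paths.get(args[1]).toUri().toURL();
            try (URLClassLoader withJar = new URLClassLoader(new URL[] {jar}, appLoader)) {
                System.out.println("with " + args[1] + " added: "
                        + isVisible(className, withJar));
            }
        }
    }
}
```

Run with the same classpath as the task JVM, the first line should print false when the jar is missing and true once the nutch-*.jar is on the classpath.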

Dennis Kubes

Dennis Kubes wrote:
> I spoke too soon. Below is the output of errors on mergesegs. This
> looks more like a Hadoop issue to me, but I will need to dig into it. It
> also may be something that I am doing on my end. This was a merge of
> three different crawls of 50K each. I don't know if we want to delay or
> go ahead.
>
> Dennis Kubes
>
> java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.metadata.MetaWrapper
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:344)
>     at org.apache.hadoop.mapred.JobConf.getOutputValueClass(JobConf.java:451)
>     at org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:414)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:270)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:115)
>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1445)
> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.metadata.MetaWrapper
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:328)
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:339)
>     ... 5 more
> Caused by: java.lang.ClassNotFoundException: org.apache.nutch.metadata.MetaWrapper
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:268)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
>     at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:242)
>     at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:315)
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:326)
>     ... 6 more
>
> Dennis Kubes wrote:
>> [X] +1 Release the packages as Apache Nutch 0.9
>> [ ] -1 Do not release the packages because...
>>
>> I have been running some bigger crawls with the release this morning.
>> Everything looks good.
>>
>> Dennis Kubes
>>
>> Chris Mattmann wrote:
>>> Hi Folks,
>>>
>>> I have posted a candidate for the Apache Nutch 0.9 release at
>>>
>>> http://people.apache.org/~mattmann/nutch_0.9/
>>>
>>> See the included CHANGES-0.9.txt file for details on release
>>> contents and latest changes. The release was made from the 0.9-dev
>>> trunk.
>>>
>>> Please vote on releasing these packages as Apache Nutch 0.9.
>>> The vote is open for the next 72 hours. Only votes from Nutch
>>> committers are binding, but everyone is welcome to check the release
>>> candidate and voice their approval or disapproval. The vote passes if
>>> at least three binding +1 votes are cast.
>>>
>>> [ ] +1 Release the packages as Apache Nutch 0.9
>>> [ ] -1 Do not release the packages because...
>>>
>>> Thanks!
>>>
>>> Cheers,
>>> Chris

_______________________________________________
Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers
