Hi,

There is no need to run Nutch by following the tutorial; it is extremely outdated and confusing. This worked for me:

bin/hadoop jar nutch-1.3.job org.apache.nutch.crawl.Crawl urls -dir crawl -depth 3 -topN 50

I got it from:
http://www.marco.bianchi.name/myPortal/how-to-run-nutch-13-in-distributed-mode.aspx

Thanks,
Peyman

On Fri, Dec 23, 2011 at 2:11 AM, Markus Jelsma <[email protected]> wrote:
> Something on your path is missing. What if you upgrade to Nutch 1.4 and try
> again?
>
> On Thursday 22 December 2011 02:47:21 Peyman Mohajerian wrote:
>> Hi Guys,
>>
>> I run Nutch fine without using Hadoop, but following:
>> http://wiki.apache.org/nutch/NutchHadoopTutorial
>> I get this error when I start crawling:
>> class not found exception on: org/apache/hadoop/util/PlatformName
>>
>> This class is in hadoop-core-0.20.2.jar that comes with Nutch1.3.
>> Initially i didn't copy this file to my 'nutch/lib' directory because
>> I assumed hadoop already has this jar and I don't have to copy it from
>> Nutch lib over. But due to the above error I decided to copy it over,
>> but it didn't help. I'm assuming there is a jar conflict at some
>> point. The tutorial is not clear, what I understand from it is that
>> I'm supposed to merge all the lib, bin, conf from both hadoop and
>> nutch in one location and there are some incompatible jars. I'm using
>> Hadoop .20.205, Running any Map/Reduce job or copying stuff to hdfs
>> works just fine.
>>
>> Any ideas?
>>
>> Thanks
>> Peyman
>>
>> here is the stack:
>> peyman@ubuntu:/host/Users/Peyman/Documents/hadoop-0.20.205.0/nutch$
>> bin/nutch crawl /user/peyman/urls -dir fbprofilecrawl -depth 3 -topN 50
>> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName
>> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName
>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>> Could not find the main class: org.apache.hadoop.util.PlatformName.
>> Program will exit.
>> solrUrl is not set, indexing will be skipped...
>
> --
> Markus Jelsma - CTO - Openindex
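
P.S. For anyone who wants to script the bin/hadoop jar approach from the top of this message, here is a rough sketch, not a tested recipe. It assumes the script is run from the Hadoop install directory, that nutch-1.3.job has been copied there, and that a local seed directory called urls exists. NUTCH_JOB, DRY_RUN, and the helper function names are made up for illustration. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: run the Nutch 1.3 crawl as a Hadoop job via the .job file.
# Assumptions (not from the thread): run from the Hadoop install directory,
# nutch-1.3.job is present there, and DRY_RUN is a hypothetical switch that
# prints each command instead of executing it.

NUTCH_JOB="${NUTCH_JOB:-nutch-1.3.job}"
DRY_RUN="${DRY_RUN:-1}"

# Build the crawl command: seed dir, output dir, depth, topN.
crawl_cmd() {
    echo "bin/hadoop jar $NUTCH_JOB org.apache.nutch.crawl.Crawl $1 -dir $2 -depth $3 -topN $4"
}

# Build the command that copies a local seed directory into HDFS.
put_cmd() {
    echo "bin/hadoop fs -put $1 $2"
}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$1"
    else
        $1
    fi
}

run "$(put_cmd urls urls)"
run "$(crawl_cmd urls crawl 3 50)"
```

Set DRY_RUN=0 to actually execute the two commands. The relative HDFS path urls resolves against the user's HDFS home directory, e.g. /user/peyman/urls as in the original post.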

