Hi,
There is no need to follow the tutorial to run Nutch on Hadoop; it is
badly outdated and confusing.
This worked for me:
bin/hadoop jar nutch-1.3.job org.apache.nutch.crawl.Crawl urls -dir crawl -depth 3 -topN 50

I got it from:
http://www.marco.bianchi.name/myPortal/how-to-run-nutch-13-in-distributed-mode.aspx
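
In case it helps anyone else, here is a rough sketch of how that command
could be wrapped so the arguments can be checked before the job is
submitted. The helper name crawl_cmd and the variable NUTCH_JOB are my own
and are not part of Nutch or Hadoop. I am also assuming the seed list
already sits in HDFS under "urls" and that the job file is in the Hadoop
directory:

```shell
# Sketch only: crawl_cmd and NUTCH_JOB are hypothetical names, not part of
# Nutch or Hadoop. Run from the Hadoop install directory.
NUTCH_JOB="nutch-1.3.job"   # assumed location of the Nutch job file

crawl_cmd() {
  # $1 = seed URL dir in HDFS, $2 = crawl output dir, $3 = depth, $4 = topN
  printf 'bin/hadoop jar %s org.apache.nutch.crawl.Crawl %s -dir %s -depth %s -topN %s\n' \
    "$NUTCH_JOB" "$1" "$2" "$3" "$4"
}

# The seed list has to be in HDFS before the crawl starts, for example
# (assuming a local directory named urls):
#   bin/hadoop fs -put urls urls

# Print the command so it can be inspected first.
crawl_cmd urls crawl 3 50

# To actually submit the job:
#   eval "$(crawl_cmd urls crawl 3 50)"
```

Printing the command first is just a convenience; running the bin/hadoop
line directly does the same thing.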

Thanks,
Peyman

On Fri, Dec 23, 2011 at 2:11 AM, Markus Jelsma
<[email protected]> wrote:
> Something on your path is missing. What if you upgrade to Nutch 1.4 and try
> again?
>
> On Thursday 22 December 2011 02:47:21 Peyman Mohajerian wrote:
>> Hi Guys,
>>
>> I run Nutch fine without using Hadoop, but following:
>> http://wiki.apache.org/nutch/NutchHadoopTutorial
>> I get this error when I start crawling:
>> class not found exception on: org/apache/hadoop/util/PlatformName
>>
>> This class is in hadoop-core-0.20.2.jar, which ships with Nutch 1.3.
>> Initially I didn't copy this file to my 'nutch/lib' directory because
>> I assumed Hadoop already had this jar and I wouldn't have to copy it
>> over from the Nutch lib. Because of the above error I copied it over
>> anyway, but it didn't help. I assume there is a jar conflict at some
>> point. The tutorial is not clear; what I understand from it is that
>> I'm supposed to merge the lib, bin, and conf directories from both
>> Hadoop and Nutch in one location, and some of the jars are
>> incompatible. I'm using Hadoop 0.20.205. Running any Map/Reduce job
>> or copying files to HDFS works just fine.
>>
>> Any ideas?
>>
>> Thanks
>> Peyman
>>
>> here is the stack:
>> peyman@ubuntu:/host/Users/Peyman/Documents/hadoop-0.20.205.0/nutch$ bin/nutch crawl /user/peyman/urls -dir fbprofilecrawl -depth 3 -topN 50
>> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName
>> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName
>>       at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>>       at java.security.AccessController.doPrivileged(Native Method)
>>       at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>>       at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>>       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>>       at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>> Could not find the main class: org.apache.hadoop.util.PlatformName. Program will exit.
>> solrUrl is not set, indexing will be skipped...
>
> --
> Markus Jelsma - CTO - Openindex
