> There is no need to run Nutch following the tutorial; the tutorial is
> extremely outdated and confusing.
>

You are welcome to contribute and improve it.



> this worked for me:
> bin/hadoop jar nutch-1.3.job org.apache.nutch.crawl.Crawl urls -dir crawl -depth 3 -topN 50
>
> I got it from:
>
> http://www.marco.bianchi.name/myPortal/how-to-run-nutch-13-in-distributed-mode.aspx
>
> Thanks,
> Peyman
>
> On Fri, Dec 23, 2011 at 2:11 AM, Markus Jelsma
> <[email protected]> wrote:
> > Something on your path is missing. What if you upgrade to Nutch 1.4 and
> > try again?
> >
> > On Thursday 22 December 2011 02:47:21 Peyman Mohajerian wrote:
> >> Hi Guys,
> >>
> >> I run Nutch fine without using Hadoop, but following:
> >> http://wiki.apache.org/nutch/NutchHadoopTutorial
> >> I get this error when I start crawling:
> >> class not found exception on: org/apache/hadoop/util/PlatformName
> >>
> >> This class is in hadoop-core-0.20.2.jar, which comes with Nutch 1.3.
> >> Initially I didn't copy this file to my 'nutch/lib' directory because
> >> I assumed Hadoop already has this jar and I don't have to copy it over
> >> from the Nutch lib. But due to the above error I decided to copy it
> >> over, and it didn't help. I'm assuming there is a jar conflict at some
> >> point. The tutorial is not clear; what I understand from it is that
> >> I'm supposed to merge all the lib, bin, and conf from both Hadoop and
> >> Nutch in one location, and there are some incompatible jars. I'm using
> >> Hadoop 0.20.205. Running any Map/Reduce job or copying stuff to HDFS
> >> works just fine.
> >>
> >> Any ideas?
> >>
> >> Thanks
> >> Peyman
> >>
> >> here is the stack:
> >> peyman@ubuntu:/host/Users/Peyman/Documents/hadoop-0.20.205.0/nutch$ bin/nutch crawl /user/peyman/urls -dir fbprofilecrawl -depth 3 -topN 50
> >> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName
> >> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName
> >>       at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> >>       at java.security.AccessController.doPrivileged(Native Method)
> >>       at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> >>       at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
> >>       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
> >>       at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
> >> Could not find the main class: org.apache.hadoop.util.PlatformName. Program will exit.
> >> solrUrl is not set, indexing will be skipped...
> >
> > --
> > Markus Jelsma - CTO - Openindex
>

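For anyone who wants to reuse the job-file invocation quoted above, here is a minimal sketch that only assembles and prints the command so it can be reviewed before running. The function name crawl_cmd and the NUTCH_JOB variable are made up for illustration; the arguments are the ones from the quoted command.

```shell
#!/bin/sh
# Hypothetical helper: builds the "bin/hadoop jar ... Crawl" command line
# quoted in this thread. NUTCH_JOB and crawl_cmd are illustrative names,
# not part of Nutch or Hadoop.
NUTCH_JOB=${NUTCH_JOB:-nutch-1.3.job}

crawl_cmd() {
  seed_dir=$1   # directory holding the seed URL list
  out_dir=$2    # crawl output directory (-dir)
  depth=$3      # number of crawl rounds (-depth)
  top_n=$4      # max URLs fetched per round (-topN)
  echo "bin/hadoop jar $NUTCH_JOB org.apache.nutch.crawl.Crawl $seed_dir -dir $out_dir -depth $depth -topN $top_n"
}

# Dry run: print the command instead of executing it.
crawl_cmd urls crawl 3 50
```

Once the printed line looks right, it can be pasted into a shell in the Hadoop directory.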


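When a jar conflict is suspected, as in the quoted message, one way to narrow it down is to list which jars in a directory actually contain the missing class. This is only a rough sketch: it assumes the unzip tool is installed, the function name find_class_in_jars is made up, and the "lib" directory default is a guess at the layout rather than something taken from the thread.

```shell
#!/bin/sh
# Hypothetical helper: print every jar in a directory that contains the
# given class file. Returns 0 if at least one jar matches, 1 otherwise.
find_class_in_jars() {
  dir=$1          # directory to scan, e.g. a lib/ folder (assumption)
  class_file=$2   # class as a path inside the jar, ending in .class
  found=1
  for jar in "$dir"/*.jar; do
    [ -f "$jar" ] || continue   # skip when the glob matched nothing
    # unzip -l lists the archive entries; grep -F treats the name literally
    if unzip -l "$jar" 2>/dev/null | grep -F -q "$class_file"; then
      echo "$jar"
      found=0
    fi
  done
  return "$found"
}

# Example call; LIB_DIR is an illustrative variable, defaulting to ./lib.
find_class_in_jars "${LIB_DIR:-lib}" org/apache/hadoop/util/PlatformName.class \
  || echo "class not found in any jar under ${LIB_DIR:-lib}" >&2
```

If the class shows up in more than one jar, or in none of the jars on the classpath, that points at where to look next.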
-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
