Hi,

I have used Nutch 2.3, so I don't know whether this will help with 1.x. In the deploy folder there is a crawl script in the bin folder:
*runtime/deploy/bin/crawl /tmp/seed.txt group_a 1000*

The seed.txt file should be copied to HDFS first.

Thanks,
Divjot

On Wed, Nov 2, 2016 at 9:40 PM, Michael Coffey <[email protected]> wrote:

> I'm having trouble trying to get Nutch 1.12 to run on hadoop 2.7.3.
> I get a class not found exception for org.apache.nutch.crawl.Crawl, as in
> the following attempt.
>
> $HADOOP_HOME/bin/hadoop jar "/home/mjc/apache-nutch-1.12/runtime/deploy/apache-nutch-1.12.job" org.apache.nutch.crawl.Crawl seed -dir seed -depth 1 -topN 5
> Exception in thread "main" java.lang.ClassNotFoundException: org.apache.nutch.crawl.Crawl
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>
> Searching the web, I see that things seem to have changed in recent
> versions of Nutch. However, I have not been able to find a good tutorial or
> step-by-step guide for how to get this to work. I would appreciate any
> advice you could give. Is there documentation somewhere? Should I be using
> an older version??
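
For reference, the suggestion at the top of this reply (copy the seed file to HDFS, then run the crawl script from the deploy folder) could be sketched roughly as below. This is only a sketch under assumptions: it expects `hadoop` to be on the PATH, it reuses /tmp/seed.txt as both the local file and the HDFS location, and the argument order for bin/crawl is simply the one from the command above (Nutch 2.3), so check the usage text that bin/crawl prints for your version. The `run` and `stage_and_crawl` helpers and the `DRY_RUN` variable are hypothetical names, not part of Nutch or Hadoop. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set.
# Every default below is an assumption taken from (or modelled on) this thread.
NUTCH_DEPLOY="${NUTCH_DEPLOY:-runtime/deploy}"   # deploy folder of the Nutch build
LOCAL_SEED="${LOCAL_SEED:-/tmp/seed.txt}"        # seed list on the local disk
HDFS_SEED="${HDFS_SEED:-/tmp/seed.txt}"          # where the seed list goes in HDFS
CRAWL_ID="${CRAWL_ID:-group_a}"                  # second argument from the command above
ROUNDS="${ROUNDS:-1000}"                         # third argument from the command above

# Hypothetical helper: echo a command, and run it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# Hypothetical helper: stage the seed file in HDFS, then start the crawl script.
stage_and_crawl() {
  # Make sure the target directory exists in HDFS.
  run hadoop fs -mkdir -p "$(dirname "$HDFS_SEED")"
  # Copy the local seed list into HDFS (-f overwrites an existing copy).
  run hadoop fs -put -f "$LOCAL_SEED" "$HDFS_SEED"
  # Run the crawl script that ships in the deploy folder.
  run "$NUTCH_DEPLOY/bin/crawl" "$HDFS_SEED" "$CRAWL_ID" "$ROUNDS"
}

stage_and_crawl
```

Running it as is only lists the three commands, which makes it easy to compare them against your own paths before executing anything with DRY_RUN=0.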

