@ Ian, Apologies, this one slipped through the net.
On Wed, Sep 26, 2012 at 8:26 PM, Ian Truslove <[email protected]> wrote:

> ubuntu:~/apache-nutch-svn-2.1$ ~/hadoop-1.0.3/bin/hadoop jar
> build/apache-nutch-2.1.job org.apache.nutch.crawl.Crawler urls -dir urls
> -depth 3 -topN 5

The above is not required. Once you've ssh'd into your JobTracker, all you
need to do is head to your deploy directory and use the ./nutch script from
there. It will save you a good deal of time compared with typing out fully
qualified class names at the console. You should not experience any problems
running on a locally deployed Hadoop cluster; however, we are aware of
problems with CDH4.

>
> Unzipping the .job file shows gora.properties exists in the root directory
> of the archive.

Is it the SqlStore that you wish to use to store your webdb? If so, then
your gora.properties config seems to be OK. If not, you'll need to edit it
and rebuild the .job file using 'ant job'.

Please let us know how you get on.

Lewis

--
Lewis

