Fixed it. Turns out I'd copied the conf files to the wrong directory. However, I'm now having trouble running my Java code. Previously I put my jar into the runtime/local/lib directory and then called bin/nutch myClass. For pseudo-distributed mode I put my jar in the hadoop/lib directory, but I'm still getting a ClassNotFoundException.

On Mon, Nov 28, 2011 at 3:38 PM, Bai Shen <[email protected]> wrote:

> I've changed nutch to use the pseudo-distributed mode, but it keeps
> erroring out that no agent is listed in the http.agent.name property. I
> copied over my conf directory from local, but that didn't fix it. What am
> I missing?
>
> On Mon, Nov 28, 2011 at 9:23 AM, Julien Nioche <[email protected]> wrote:
>
>> Simply run Nutch in pseudo-distributed mode. If you have no idea of what
>> this means, then it would be a good idea to have a look at
>> http://hadoop.apache.org/common/docs/stable/single_node_setup.html and in
>> particular the section mentioning http://localhost:50030/jobtracker.jsp
>>
>> On 28 November 2011 14:09, Bai Shen <[email protected]> wrote:
>>
>> > We looked at the hadoop reporter and aren't sure how to access it with
>> > nutch. Is there a certain way it works? Can you give me an example?
>> > Thanks.
>> >
>> > On Mon, Nov 21, 2011 at 3:11 PM, Markus Jelsma <[email protected]> wrote:
>> >
>> > > > On Thu, Nov 10, 2011 at 3:32 PM, Markus Jelsma <[email protected]> wrote:
>> > > >
>> > > > > > Interesting. How do you tell if the segments have been fetched, etc?
>> > > > >
>> > > > > after a job the shell script waits for its completion and return code.
>> > > > > If it returns 0 all is fine and we move it to another queue. If != 0
>> > > > > then there's an error and reports via mail.
>> > > > >
>> > > > > Ah, okay. I didn't realize it was returning an error code.
>> > > > >
>> > > > > > How do you know if there are any urls that had problems?
>> > > > >
>> > > > > Hadoop reporter shows statistics. There are always many errors for many
>> > > > > reasons. This is normal because we crawl everything.
>> > > >
>> > > > How are you running Hadoop reporter?
>> > >
>> > > You'll get it for free when operating a Hadoop cluster.
>> > >
>> > > > > > Or fetch jobs that errored out, etc.
>> > > > >
>> > > > > The non-zero return code.
>>
>> --
>> Open Source Solutions for Text Engineering
>>
>> http://digitalpebble.blogspot.com/
>> http://www.digitalpebble.com
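
For anyone reading the thread later: the return-code check described in the quoted messages (wait for the job, treat 0 as success and move on to the next queue, treat anything else as an error and report it) could be sketched roughly as below. This is only an illustration in Python, not the actual shell script from the thread. The names run_job and next_step and the labels "next_queue" and "report_error" are made up for the example, and the commands are stand-ins rather than real Nutch invocations.

```python
import subprocess
import sys


def run_job(cmd):
    """Run one job, block until it finishes, and return its exit code."""
    return subprocess.run(cmd).returncode


def next_step(return_code):
    """Decide what to do with a finished job (labels are hypothetical)."""
    # 0 means the job succeeded, so the segment can go to the next queue.
    # Any other value means the job failed and should be reported
    # (the thread mentions a report by mail).
    return "next_queue" if return_code == 0 else "report_error"


if __name__ == "__main__":
    # Stand-in commands that just exit with a chosen code; a real setup
    # would launch the crawl job here instead.
    ok = run_job([sys.executable, "-c", "raise SystemExit(0)"])
    failed = run_job([sys.executable, "-c", "raise SystemExit(3)"])
    print(next_step(ok), next_step(failed))
```

The decision is kept in its own small function so it can be checked without launching a real job.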

