Update: The Nutch configuration files need to go in the Hadoop conf directory. Could someone recommend best practices for the file structure? Should all the Nutch config files simply be copied to the Hadoop conf directory? Currently I have:
/webcrawler/hadoop
/webcrawler/nutch

I guess I'm a bit confused because 1.3 didn't come bundled with Hadoop.

Thanks!

~Jason

On Mon, Jun 13, 2011 at 12:07 PM, Jason Stubblefield <[email protected]> wrote:

> Hello,
>
> I'm trying to fetch a segment using Hadoop on a single node with Nutch 1.3.
> I seem to be struggling with the new runtime configuration. I have Hadoop
> up and running and have successfully run the readdb -stats command and
> generated a segment, but when I run:
>
> runtime/deploy/bin/nutch fetch crawl/segments/20110613103305 -threads 8
>
> I get an error message: No agents listed in 'http.agent.name' property
>
> I noticed there are now two conf directories, one at trunk/conf and the
> other at trunk/runtime/local/conf, and have updated both of them with my
> nutch-site.xml file. Both have a properly configured http.agent.name.
>
> Do I need to explicitly declare the conf directory somewhere? Do I need
> to move the conf file to trunk/runtime/deploy/conf, or put it somewhere
> else? What am I missing?
>
> Thanks in advance!
>
> ~Jason
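
For reference, the http.agent.name property named in the error message is normally set in nutch-site.xml with an entry of roughly the following shape. This is only a minimal sketch: the value MyTestCrawler is a placeholder assumption, not the value from my actual file.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Placeholder value (assumption): replace MyTestCrawler with your own crawler name. -->
  <property>
    <name>http.agent.name</name>
    <value>MyTestCrawler</value>
  </property>
</configuration>
```

Whichever copy of this file the deploy job actually reads is the one that must contain the property.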

