Hi Jason,

If you have Hadoop running independently from Nutch, you should use runtime/deploy/bin. The conf files can go either directly in the hadoop/conf dir or in the Nutch job file, which you will then need to regenerate with 'ant job' so that it reflects the changes you made in NUTCH/conf.

Julien

On 13 June 2011 11:59, Jason Stubblefield <[email protected]> wrote:

> Update: The nutch configuration files need to go in the hadoop conf file.
>
> Maybe someone could recommend some best practices regarding the file
> structure? Should all the nutch config files simply be copied to the hadoop
> conf directory? Currently I have:
>
> /webcrawler/hadoop
> /webcrawler/nutch
>
> I guess I'm a bit confused because 1.3 didn't come bundled with hadoop.
>
> Thanks!
>
> ~Jason
>
> On Mon, Jun 13, 2011 at 12:07 PM, Jason Stubblefield <[email protected]> wrote:
>
> > Hello,
> >
> > I'm trying to fetch a segment using hadoop on a single node with nutch 1.3.
> > I seem to be struggling with the new runtime configuration. I have hadoop
> > up and running and have successfully run the readdb -stats command and
> > generated a segment, but when I run:
> >
> > runtime/deploy/bin/nutch fetch crawl/segments/20110613103305 -threads 8
> >
> > I get an error message: No agents listed in 'http.agent.name' property
> >
> > I noticed there are now 2 conf files, one at trunk/conf and the other at
> > trunk/runtime/local/conf, and have updated both of them with my
> > nutch-site.xml file; both have a properly configured http.agent.name.
> >
> > Do I need to explicitly declare the conf directory somewhere? Do I need
> > to move the conf file to trunk/runtime/deploy/conf, or put it somewhere
> > else? What am I missing?
> >
> > Thanks in advance!
> >
> > ~Jason

--
Open Source Solutions for Text Engineering
http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
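
For anyone reading this thread later, the two options from the reply could be sketched roughly as below. This is only an illustration, not something taken from the Nutch docs: the function names copy_nutch_conf and rebuild_nutch_job are made up here, the /webcrawler paths in the usage comments come from Jason's layout, and it assumes that nutch-site.xml is the only file you edited and that 'ant' is on your PATH.

```shell
#!/bin/sh
# Sketch only: function names and example paths are assumptions, not Nutch tooling.

# Option 1: copy the edited nutch-site.xml into Hadoop's conf directory.
#   $1 = Nutch conf dir   (e.g. /webcrawler/nutch/conf)
#   $2 = Hadoop conf dir  (e.g. /webcrawler/hadoop/conf)
copy_nutch_conf() {
    nutch_conf=$1
    hadoop_conf=$2
    if [ ! -f "$nutch_conf/nutch-site.xml" ]; then
        echo "no nutch-site.xml in $nutch_conf" >&2
        return 1
    fi
    cp "$nutch_conf/nutch-site.xml" "$hadoop_conf/"
}

# Option 2: regenerate the job file so it picks up the edited conf.
#   $1 = Nutch source root (the directory that holds build.xml)
rebuild_nutch_job() {
    # Subshell so the caller's working directory is left untouched.
    (cd "$1" && ant job)
}

# Example usage (pick one option, then rerun the fetch from the Nutch root):
#   copy_nutch_conf /webcrawler/nutch/conf /webcrawler/hadoop/conf
#   rebuild_nutch_job /webcrawler/nutch
#   runtime/deploy/bin/nutch fetch crawl/segments/20110613103305 -threads 8
```

Either route should be enough on its own; the point of both is that the deployed job sees the http.agent.name you configured.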

