Hi Jason,

If you have Hadoop running independently of Nutch, you should use
runtime/deploy/bin. The conf files can go either directly in the hadoop/conf
dir or in the Nutch job file, which you will need to regenerate with 'ant job'
so that it reflects the changes you made in NUTCH/conf.
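
As a rough sketch of the two options above, assuming the /webcrawler layout
mentioned below: NUTCH_HOME, HADOOP_HOME and the helper function names are
illustrative and not part of Nutch or Hadoop. The check is a plain text match
for the property name, not a real XML parse.

```shell
#!/bin/sh
# Assumed locations (hypothetical defaults taken from the layout in this thread).
NUTCH_HOME="${NUTCH_HOME:-/webcrawler/nutch}"
HADOOP_HOME="${HADOOP_HOME:-/webcrawler/hadoop}"

# Succeeds if the given file declares the http.agent.name property.
# Text match only; it does not verify that the value is non-empty.
has_agent_name() {
    grep -q '<name>http.agent.name</name>' "$1"
}

# Option 1: regenerate the Nutch job so it reflects NUTCH/conf.
rebuild_job() {
    if ! has_agent_name "$NUTCH_HOME/conf/nutch-site.xml"; then
        echo "http.agent.name not found in $NUTCH_HOME/conf/nutch-site.xml" >&2
        return 1
    fi
    # Run ant in a subshell so the caller's working directory is unchanged.
    (cd "$NUTCH_HOME" && ant job)
}

# Option 2: put the site config directly in Hadoop's conf dir instead.
copy_conf() {
    cp "$NUTCH_HOME/conf/nutch-site.xml" "$HADOOP_HOME/conf/"
}
```

After either rebuild_job or copy_conf, the fetch would be launched from
runtime/deploy/bin as before, e.g.
runtime/deploy/bin/nutch fetch crawl/segments/20110613103305 -threads 8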

Julien

On 13 June 2011 11:59, Jason Stubblefield
<[email protected]> wrote:

> Update:  The Nutch configuration files need to go in the Hadoop conf
> directory.
>
> Maybe someone could recommend some best practices regarding the file
> structure?  Should all the Nutch config files simply be copied to the
> Hadoop conf directory?  Currently I have:
>
> /webcrawler/hadoop
> /webcrawler/nutch
>
> I guess I'm a bit confused because 1.3 didn't come bundled with Hadoop.
>
> Thanks!
>
> ~Jason
>
> On Mon, Jun 13, 2011 at 12:07 PM, Jason Stubblefield <
> [email protected]> wrote:
>
> > Hello,
> >
> > I'm trying to fetch a segment using Hadoop on a single node with Nutch
> > 1.3.  I seem to be struggling with the new runtime configuration.  I have
> > Hadoop up and running and have successfully run the readdb -stats command
> > and generated a segment, but when I run:
> >
> > runtime/deploy/bin/nutch fetch crawl/segments/20110613103305 -threads 8
> >
> > I get an error message: No agents listed in 'http.agent.name' property
> >
> > I noticed there are now 2 conf directories, one at trunk/conf and the
> > other at trunk/runtime/local/conf, and have updated both of them with my
> > nutch-site.xml file; both have a properly configured http.agent.name.
> >
> > Do I need to explicitly declare the conf directory somewhere?  Do I need
> > to move the conf file to trunk/runtime/deploy/conf, or put it somewhere
> > else?  What am I missing?
> >
> > Thanks in advance!
> >
> > ~Jason
> >
>



-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
