Hi,

> So where is Nutch in Java loading the configuration file from? (and how can
> I overwrite it)

– configuration files are found via Java’s classpath

– if a file with the same name exists in several directories
  of the classpath, only the first one found is used

– settings in nutch-site.xml override settings in nutch-default.xml
  DO NOT CHANGE nutch-default.xml; put your changes in nutch-site.xml only

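– the value of the environment variable NUTCH_CONF_DIR
  is put at the front of the classpath by the script $NUTCH_HOME/bin/nutch;
  if you start Nutch from your own Java code instead of that script,
  the variable alone has no effect on the classpath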
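
To see which copy of each file actually wins, you can ask the class loader
directly. The following is only a rough sketch using plain JDK calls: the
class name ConfigLocator and the method locate are made up for illustration,
and it assumes the configuration is looked up through the thread's context
class loader. Run it with exactly the same classpath as your crawl code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Nutch: lists where a resource is found.
class ConfigLocator {

    // Returns every classpath location of the named resource, in the order
    // the class loader reports them. The first entry is normally the one used.
    static List<URL> locate(String resourceName) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            // Fall back to the system class loader if no context loader is set.
            cl = ClassLoader.getSystemClassLoader();
        }
        try {
            return Collections.list(cl.getResources(resourceName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"nutch-default.xml", "nutch-site.xml"}) {
            List<URL> hits = locate(name);
            if (hits.isEmpty()) {
                System.out.println(name + ": not on the classpath");
                continue;
            }
            System.out.println(name + ": used = " + hits.get(0));
            // Any further hits are copies hidden by the first one.
            for (int i = 1; i < hits.size(); i++) {
                System.out.println("  shadowed: " + hits.get(i));
            }
        }
    }
}
```

If nutch-site.xml is reported as "not on the classpath", or the "used" copy
is not the one you edited, the classpath of your Java call is the problem.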

> I also have the proper (I think) environment variables set: /NUTCH_HOME/ and
> /NUTCH_CONF_DIR/.

Get the full Java command line from the process listing (ps or top) and
check the classpath carefully.
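
If the process listing is hard to read, the same check can be done from
inside the JVM. This is a small sketch under assumptions: ClasspathDump and
its methods are hypothetical names, it relies only on the standard
java.class.path system property, and it inspects directories only (a copy of
the file packed inside a jar is not detected).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch: prints the classpath in order.
class ClasspathDump {

    // Splits a classpath string on the platform separator (":" or ";").
    static List<String> splitEntries(String classpath) {
        List<String> entries = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    // True if the entry is a directory that directly contains the file.
    static boolean dirContains(String entry, String fileName) {
        File dir = new File(entry);
        return dir.isDirectory() && new File(dir, fileName).isFile();
    }

    public static void main(String[] args) {
        String classpath = System.getProperty("java.class.path", "");
        int position = 0;
        for (String entry : splitEntries(classpath)) {
            String mark = dirContains(entry, "nutch-site.xml")
                    ? "  <-- contains nutch-site.xml" : "";
            System.out.println(position++ + ": " + entry + mark);
        }
    }
}
```

The entry closest to the top that is marked is the directory whose
nutch-site.xml should be picked up.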

For details, see
http://wiki.apache.org/hadoop/HowToConfigure
http://wiki.apache.org/nutch/NutchConfigurationFiles

Sebastian

On 02/21/2013 09:03 PM, imehesz wrote:
> hello,
> 
> I finally crossed all the terminal issues and I can run Nutch and Solr with
> no problems from the command line.
> 
> When I try to implement Nutch crawling from Java, it's a different story.
> The error message is pretty self-explanatory: 
> /Fetcher: No agents listed in 'http.agent.name' property./
> 
> I have, of course, set the value in my /conf/nutch-default.xml/ and created a
> symlink from /conf/nutch-site.xml/, to make sure they are the same. (it
> didn't work separately either, btw).
> 
> I also have the proper (I think) environment variables set: /NUTCH_HOME/ and
> /NUTCH_CONF_DIR/.
> 
> When I step through Nutch's fetcher code, I can definitely see that the
> /agent.name/ property is *NULL*. 
> At this point I purposely broke the XML settings file, and saw that it
> doesn't even load that file to begin with. (as in nothing changes)
> 
> So where is Nutch in Java loading the configuration file from? (and how can
> I overwrite it)
> 
> thanks,
> ---iM
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Nutch-1-6-with-Java-not-loading-correct-configuration-file-tp4041998.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
> 
