Hi,

> So where is Nutch in Java loading the configuration file from? (and how can
> I overwrite it)

- configuration files are found via Java's classpath
- only the first instance of each file found in one of the directories
  of the classpath is used
- settings in nutch-site.xml overwrite settings in nutch-default.xml
  DO NOT CHANGE nutch-default.xml, only change nutch-site.xml
- the content of the environment variable NUTCH_CONF_DIR is added
  by the script $NUTCH_HOME/bin/nutch in front of the classpath

> I also have the proper (I think) environment variables set: /NUTCH_HOME/ and
> /NUTCH_CONF_DIR/.

You should get the Java call via the process listing (ps or top)
and check the classpath carefully.

For details, see
  http://wiki.apache.org/hadoop/HowToConfigure
  http://wiki.apache.org/nutch/NutchConfigurationFiles

Sebastian

On 02/21/2013 09:03 PM, imehesz wrote:
> hello,
>
> I finally crossed all the terminal issues and I can run Nutch and Solr with
> no problems from the command line.
>
> When I try to implement Nutch crawling from JAVA, it's a different story.
> The error message is pretty self-explanatory:
> /Fetcher: No agents listed in 'http.agent.name' property./
>
> I have of course, set the value in my /conf/nutch-default.xml/ and created a
> symlink from /conf/nutch-site.xml/, to make sure they are the same. (it
> didn't work separately either, btw).
>
> I also have the proper (I think) environment variables set: /NUTCH_HOME/ and
> /NUTCH_CONF_DIR/.
>
> When I step through Nutch's fetcher code, I can definitely see that the
> /agent.name/ property is *NULL*.
> At this point I purposely broke the XML settings file, and saw that it
> doesn't even load that file to begin with. (as in nothing changes)
>
> So where is Nutch in Java loading the configuration file from? (and how can
> I overwrite it)
>
> thanks,
> ---iM
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Nutch-1-6-with-Java-not-loading-correct-configuration-file-tp4041998.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
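
P.S. If reading the classpath from the process listing is tedious, the same check can be made from inside the Java program. The following is only a rough sketch using plain JDK calls; the class name ConfigLocator is made up for illustration and is not part of Nutch. It prints the effective classpath and every location where nutch-default.xml and nutch-site.xml are visible to the class loader. Assuming the usual first-match lookup, the first location listed should be the copy that gets loaded. Run it with the same classpath as your crawler.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not part of Nutch): shows where configuration
// files are visible on the classpath of the running JVM.
class ConfigLocator {

    // Returns every classpath location of the named resource,
    // in the order the class loader reports them.
    static List<URL> locate(String resourceName) throws IOException {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // Fall back to the system class loader if no context loader is set.
            loader = ClassLoader.getSystemClassLoader();
        }
        return Collections.list(loader.getResources(resourceName));
    }

    public static void main(String[] args) throws IOException {
        // The classpath the JVM was actually started with.
        System.out.println("java.class.path = "
                + System.getProperty("java.class.path"));

        for (String name : new String[] {"nutch-default.xml", "nutch-site.xml"}) {
            List<URL> hits = locate(name);
            if (hits.isEmpty()) {
                System.out.println(name + ": NOT FOUND on the classpath");
            } else {
                // Assumption: the first hit is the copy that is loaded.
                for (URL url : hits) {
                    System.out.println(name + ": " + url);
                }
            }
        }
    }
}
```

If nutch-site.xml is reported as not found, or the first hit is not the file you edited, put your conf directory (the one NUTCH_CONF_DIR points to) in front of the classpath of your Java program, the same way bin/nutch does.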