Hi,

> For testing purposes, I updated the bin/nutch script to pass the -conf
> option along with the hadoop jar command. It now seems to pick up the
> supplied nutch-site.xml. The second problem is that files referred to in
> nutch-site.xml, such as regex-urlfilter-1.txt, are not found.
>
> Any idea how to make these additional files available on the classpath? I
> tried adding the file to a specific directory on every node and adding
> that directory to CLASSPATH too, but still no luck.
The resource files are taken from the job jar. You could put several regex-urlfilter files in the job jar, each under a different name, and have each config file specify which one to use.

Julien

-- 
Open Source Solutions for Text Engineering
http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
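
To illustrate the suggestion above, here is a minimal sketch of what the per-config override might look like in the nutch-site.xml passed via -conf. It assumes the urlfilter-regex plugin is the active filter and that it reads its rules file name from the urlfilter.regex.file property (the name used in nutch-default.xml; worth checking against your Nutch version). regex-urlfilter-1.txt is the file name from the question and would have to be packaged into the job jar under exactly that name.

```xml
<!-- Fragment for the nutch-site.xml supplied with -conf.
     Assumption: urlfilter.regex.file is the property the regex URL filter
     consults; the value is resolved as a resource inside the job jar,
     not as a path on the nodes' local file systems. -->
<property>
  <name>urlfilter.regex.file</name>
  <value>regex-urlfilter-1.txt</value>
</property>
```

A second config file could then point the same property at another file name that is also in the job jar, so one job jar would serve several configurations.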

