Hi there,
For some reason Nutch cannot find my common-terms.utf8 file. I have placed it
under WEB-INF, WEB-INF/classes, and even WEB-INF/lib.
In my nutch-default.xml the path to the file is set as follows:
<property>
<name>analysis.common.terms.file</name>
<value>common-terms.utf8</value>
<description>The name of a file containing a list of common terms
that should be indexed in n-grams.</description>
</property>
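A quick way to check whether a relative name like this is visible at all is to ask the class loader for it directly. The sketch below assumes that Nutch resolves the value of analysis.common.terms.file as a classpath resource, which is why WEB-INF/classes seemed the natural location. The class name ResourceCheck and the method locate are made up for illustration and are not part of Nutch.

```java
import java.net.URL;

public class ResourceCheck {

    // Returns the URL the class loader resolves for the given resource name,
    // or null if the resource is not visible on the classpath.
    static URL locate(String name) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            // Fall back to the system class loader if no context loader is set.
            cl = ClassLoader.getSystemClassLoader();
        }
        return cl.getResource(name);
    }

    public static void main(String[] args) {
        // Default to the file name from the property above (assumed to be a classpath-relative name).
        String name = args.length > 0 ? args[0] : "common-terms.utf8";
        URL url = locate(name);
        if (url == null) {
            System.out.println(name + " is NOT visible on the classpath");
        } else {
            System.out.println(name + " found at " + url);
        }
    }
}
```

The check is only meaningful when it runs with the same classpath as the web application, for example with WEB-INF/classes on the classpath. A null result would mean the class loader cannot see the file from that location.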
Finally, I also tried an absolute path, but that did not work either:
<property>
<name>analysis.common.terms.file</name>
<value>/var/log/common-terms.utf8</value>
<description>The name of a file containing a list of common terms
that should be indexed in n-grams.</description>
</property>
The error I see in my log file is as follows:
DEBUG org.apache.hadoop.conf.Configuration - java.io.IOException: config()
at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:67)
at org.apache.nutch.util.NutchConfiguration.create(NutchConfiguration.java:50)
at org.apache.nutch.util.NutchConfiguration.get(NutchConfiguration.java:72)
I am using Nutch 0.8.1 (nutch-0.8.1.jar). Any help would be much appreciated.
Thanks.