Author: sujen
Date: Wed Jan 27 16:37:58 2016
New Revision: 1727122
URL: http://svn.apache.org/viewvc?rev=1727122&view=rev
Log:
NUTCH-2206 Provide example scoring.similarity.stopword.file
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/conf/nutch-default.xml
Modified: nutch/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/nutch/trunk/CHANGES.txt?rev=1727122&r1=1727121&r2=1727122&view=diff
==============================================================================
--- nutch/trunk/CHANGES.txt (original)
+++ nutch/trunk/CHANGES.txt Wed Jan 27 16:37:58 2016
@@ -1,5 +1,7 @@
Nutch Change Log
+* NUTCH-2206 Provide example scoring.similarity.stopword.file (sujen)
+
* NUTCH-2204 Remove junit lib from runtime (snagel)
* NUTCH-2201 Remove loops program from webgraph package (markus)
Modified: nutch/trunk/conf/nutch-default.xml
URL: http://svn.apache.org/viewvc/nutch/trunk/conf/nutch-default.xml?rev=1727122&r1=1727121&r2=1727122&view=diff
==============================================================================
--- nutch/trunk/conf/nutch-default.xml (original)
+++ nutch/trunk/conf/nutch-default.xml Wed Jan 27 16:37:58 2016
@@ -1383,6 +1383,28 @@ CAUTION: Set the parser.timeout to -1 or
</description>
</property>
+<!-- scoring similarity properties
+Add scoring-similarity to the list of active plugins
+ in the parameter 'plugin.includes' in order to use it.
+For more detailed information on the working of this filter
+visit https://wiki.apache.org/nutch/SimilarityScoringFilter-->
+
+<property>
+ <name>scoring.similarity.model.path</name>
+ <value>goldstandard.txt</value>
+ <description>Path to the gold standard file which contains all the relevant text and terms,
+ pertaining to the domain.
+ </description>
+</property>
+
+ <property>
+ <name>scoring.similarity.stopword.file</name>
+ <value>stopwords.txt</value>
+ <description>Name of the stopword text file. The user can specify a custom list of stop words
+ in a text file. Each new stopword should be on a new line.
+ </description>
+</property>
+
<!-- language-identifier plugin properties -->
<property>