Lewis John McGibbney created NUTCH-2207:
-------------------------------------------
Summary: Remove class duplication and smarten-up
scoring-similarity plugin
Key: NUTCH-2207
URL: https://issues.apache.org/jira/browse/NUTCH-2207
Project: Nutch
Issue Type: Improvement
Components: plugin, scoring
Affects Versions: 1.11
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Fix For: 1.12
Right now it appears that DocumentVector.java is duplicated. There is also no
license header on
[ScoringFilterModel.java|https://github.com/apache/nutch/blob/trunk/src/plugin/scoring-similarity/src/java/org/apache/nutch/scoring/similarity/ScoringFilterModel.java].
I think I've also spotted a number of places where imports are unused.
Finally, Javadoc is virtually non-existent for the scoring-similarity plugin.
It would help to add some documentation.
It would be very helpful if the [SimilarityScoringFilter wiki
page|https://wiki.apache.org/nutch/SimilarityScoringFilter] were cited.
We should also revisit the wiki page to ensure that all references are
present.
CC [~sujenshah]