Asitang Mishra created NUTCH-2136:
-------------------------------------
Summary: Implement a different version of Naive Bayes Parse Filter
Key: NUTCH-2136
URL: https://issues.apache.org/jira/browse/NUTCH-2136
Project: Nutch
Issue Type: Improvement
Components: parser
Reporter: Asitang Mishra
Fix For: 1.10
There have been many dependency issues with the first implementation of the
Naive Bayes Parse Filter. The major dependencies were Mahout and Lucene. In
addition, the training process failed in distributed mode because a nested
Hadoop job could not run on the cluster.
To remove these issues and allow the filter to run in a distributed
environment, I am going to implement my own version of Naive Bayes with no
dependency on any machine learning library.
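
For illustration only, a dependency-free Naive Bayes classifier can be quite
small. The sketch below is not the code proposed for this issue: the class name
SimpleNaiveBayes, its train/classify methods, and the label strings are all
hypothetical, and it assumes a multinomial model with add-one (Laplace)
smoothing and naive tokenization on non-word characters. It uses only java.util,
so nothing from Mahout or Lucene is needed, and training is plain in-memory
counting rather than a nested Hadoop job.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a multinomial Naive Bayes text classifier. */
final class SimpleNaiveBayes {
  // Number of training documents seen per label.
  private final Map<String, Integer> docCounts = new HashMap<>();
  // Per-label word frequencies.
  private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
  // Total number of word occurrences per label.
  private final Map<String, Integer> totalWords = new HashMap<>();
  // All distinct words seen during training.
  private final Set<String> vocabulary = new HashSet<>();
  private int totalDocs = 0;

  // Assumption: lower-casing and splitting on non-word characters is enough.
  private static String[] tokenize(String text) {
    return text.toLowerCase().split("\\W+");
  }

  /** Adds one labeled document to the model. */
  void train(String label, String text) {
    docCounts.merge(label, 1, Integer::sum);
    totalDocs++;
    Map<String, Integer> counts =
        wordCounts.computeIfAbsent(label, k -> new HashMap<>());
    for (String word : tokenize(text)) {
      if (word.isEmpty()) {
        continue; // split() can yield an empty leading token
      }
      counts.merge(word, 1, Integer::sum);
      totalWords.merge(label, 1, Integer::sum);
      vocabulary.add(word);
    }
  }

  /** Returns the most probable label, or null if nothing was trained. */
  String classify(String text) {
    String best = null;
    double bestScore = Double.NEGATIVE_INFINITY;
    String[] words = tokenize(text);
    for (String label : docCounts.keySet()) {
      // Log prior: fraction of training documents carrying this label.
      double score = Math.log(docCounts.get(label) / (double) totalDocs);
      Map<String, Integer> counts = wordCounts.get(label);
      int total = totalWords.getOrDefault(label, 0);
      for (String word : words) {
        if (word.isEmpty()) {
          continue;
        }
        // Log likelihood with add-one smoothing so unseen words
        // do not drive the probability to zero.
        int count = counts.getOrDefault(word, 0);
        score += Math.log((count + 1.0) / (total + vocabulary.size()));
      }
      if (score > bestScore) {
        bestScore = score;
        best = label;
      }
    }
    return best;
  }
}
```

A caller would feed labeled page text to train() once per example and then call
classify() on the text of each parsed page. Because the model is just a handful
of count maps, it could be serialized and shipped to each task, which is one
way to avoid launching a job from inside another job.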