Asitang Mishra created NUTCH-2136:
-------------------------------------

             Summary: Implement a different version of Naive Bayes Parse Filter
                 Key: NUTCH-2136
                 URL: https://issues.apache.org/jira/browse/NUTCH-2136
             Project: Nutch
          Issue Type: Improvement
          Components: parser
            Reporter: Asitang Mishra
             Fix For: 1.10


There have been many dependency issues with the first implementation of the
Naive Bayes Parse Filter. The major dependencies were Mahout and Lucene. In
addition, the training process failed in distributed mode because a nested
Hadoop job was unable to run on the cluster.
To remove these issues and allow the filter to run in a distributed
environment, I am going to implement my own version of Naive Bayes that does
not depend on any machine learning library.
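
For illustration only, a dependency-free classifier of this kind could look
roughly like the sketch below. It is not the code planned for this issue: the
class name SimpleNaiveBayes, its method names, the tokenizer, and the labels in
the usage note are all hypothetical. It assumes a multinomial model with add-one
(Laplace) smoothing and uses only java.util, so training is plain in-memory
counting with no Mahout, Lucene, or nested Hadoop job.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: multinomial Naive Bayes using only the JDK.
class SimpleNaiveBayes {
    // Number of training documents seen per label.
    private final Map<String, Integer> docCounts = new HashMap<>();
    // Per-label token frequencies.
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    // Total number of tokens seen per label.
    private final Map<String, Integer> totalWords = new HashMap<>();
    // All distinct tokens seen during training.
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    // Assumed tokenizer: lowercase, split on runs of non-word characters.
    private static String[] tokenize(String text) {
        return text.toLowerCase().split("\\W+");
    }

    // Add one labeled document to the model by updating the counts.
    public void train(String label, String text) {
        docCounts.merge(label, 1, Integer::sum);
        totalDocs++;
        Map<String, Integer> counts =
            wordCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String token : tokenize(text)) {
            if (token.isEmpty()) continue; // split can yield an empty token
            counts.merge(token, 1, Integer::sum);
            totalWords.merge(label, 1, Integer::sum);
            vocabulary.add(token);
        }
    }

    // Return the label with the highest log posterior, or null if untrained.
    public String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        String[] tokens = tokenize(text);
        for (String label : docCounts.keySet()) {
            // Log prior: fraction of training documents with this label.
            double score = Math.log((double) docCounts.get(label) / totalDocs);
            Map<String, Integer> counts = wordCounts.get(label);
            int total = totalWords.getOrDefault(label, 0);
            // Add-one smoothing; the extra +1 reserves a slot for unseen tokens.
            double denominator = total + vocabulary.size() + 1;
            for (String token : tokens) {
                if (token.isEmpty()) continue;
                int c = counts.getOrDefault(token, 0);
                score += Math.log((c + 1.0) / denominator);
            }
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }
}
```

With one training document per label, for example "apache nutch web crawler
search" as "relevant" and "cheap pills buy now discount" as "irrelevant",
classify("nutch crawler") returns "relevant". Because the model is just a set
of counts, it could be trained once and loaded by each task, which avoids
launching a job from inside another job.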



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
