[
https://issues.apache.org/jira/browse/NUTCH-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14952854#comment-14952854
]
ASF GitHub Bot commented on NUTCH-2136:
---------------------------------------
GitHub user asitang opened a pull request:
https://github.com/apache/nutch/pull/71
NUTCH-2136
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/asitang/nutch NUTCH-2136
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/nutch/pull/71.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #71
----
commit f9aa3f50a113c22702abad9926e2488f87485722
Author: Asitang Mishra <[email protected]>
Date: 2015-10-12T09:27:47Z
dependencies removed from ivy.xml and plugin.xml. Changed the
implementation of Naive Bayes ParseFilter
commit 5a3cc9b4ac5f250983ca81e62f9dff63ab5ead3f
Author: Asitang Mishra <[email protected]>
Date: 2015-10-12T09:33:38Z
made some cosmetic changes to the code
----
> Implement a different version of Naive Bayes Parse Filter
> ---------------------------------------------------------
>
> Key: NUTCH-2136
> URL: https://issues.apache.org/jira/browse/NUTCH-2136
> Project: Nutch
> Issue Type: Improvement
> Components: parser
> Reporter: Asitang Mishra
> Fix For: 1.10
>
>
> There have been many dependency issues with the first implementation of the
> Naive Bayes Parse Filter. The major dependencies were Mahout and Lucene. In
> addition, the training process failed in distributed mode because a nested
> Hadoop job was unable to run on the cluster.
> To resolve these issues and allow the filter to run in a distributed
> environment, I am going to implement my own version of Naive Bayes without
> depending on any machine learning library.
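
For readers unfamiliar with the idea, a Naive Bayes text classifier with no machine learning library can be sketched roughly as below. This is only an illustration and not the code from the pull request: the class name NaiveBayesSketch, the label strings, the simple non-word tokenizer, and the use of add-one (Laplace) smoothing are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a multinomial Naive Bayes classifier that uses
// only java.util collections. Names here are illustrative assumptions.
class NaiveBayesSketch {
    // label -> (token -> number of occurrences in that label's training text)
    private final Map<String, Map<String, Integer>> tokenCounts = new HashMap<>();
    // label -> total number of tokens seen for that label
    private final Map<String, Integer> totalTokens = new HashMap<>();
    // label -> number of training documents
    private final Map<String, Integer> docCounts = new HashMap<>();
    // all distinct tokens seen during training
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    // Assumed tokenizer: lower-case, split on runs of non-word characters.
    private static String[] tokenize(String text) {
        return text.toLowerCase().split("\\W+");
    }

    // Add one labeled document to the model.
    void train(String label, String text) {
        docCounts.merge(label, 1, Integer::sum);
        totalDocs++;
        Map<String, Integer> counts =
            tokenCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String token : tokenize(text)) {
            if (token.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            counts.merge(token, 1, Integer::sum);
            totalTokens.merge(label, 1, Integer::sum);
            vocabulary.add(token);
        }
    }

    // Return the label with the highest log posterior, or null if untrained.
    String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : docCounts.keySet()) {
            // log prior: fraction of training documents with this label
            double score = Math.log(docCounts.get(label) / (double) totalDocs);
            Map<String, Integer> counts = tokenCounts.get(label);
            int total = totalTokens.getOrDefault(label, 0);
            for (String token : tokenize(text)) {
                if (token.isEmpty()) {
                    continue;
                }
                int count = counts.getOrDefault(token, 0);
                // add-one smoothing so unseen tokens do not zero the product
                score += Math.log((count + 1.0) / (total + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }
}
```

Because the model is just a handful of counters, training in such a sketch needs no nested job: the counts could be built in a single pass and loaded by each task, though how the actual patch handles this is not stated here.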