Is org.apache.mahout.classifier.naivebayes also based on that paper? I thought it was only relevant for org.apache.mahout.classifier.bayes.

On 28.06.2011 23:58, Ted Dunning wrote:
See here:
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.13.8572&rank=1

On Tue, Jun 28, 2011 at 2:43 PM, Sebastian Schelter (JIRA)
<[email protected]> wrote:


    [
https://issues.apache.org/jira/browse/MAHOUT-746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13056805#comment-13056805]

Sebastian Schelter commented on MAHOUT-746:
-------------------------------------------

Thank you very much, Sean.

I wonder whether there is an article or paper that describes this particular
approach to implementing Naive Bayes? A colleague of mine with a much deeper
statistics background and I took a look at the details of the computation
today, and we were left with some open questions.

Refactoring of the parallel Naive Bayes implementation in
org.apache.mahout.classifier.naivebayes

-------------------------------------------------------------------------------------------------

                 Key: MAHOUT-746
                 URL: https://issues.apache.org/jira/browse/MAHOUT-746
             Project: Mahout
          Issue Type: Improvement
          Components: Classification
    Affects Versions: 0.6
            Reporter: Sebastian Schelter
            Assignee: Sebastian Schelter
             Fix For: 0.6

         Attachments: MAHOUT-746.patch


I refactored the code in org.apache.mahout.classifier.naivebayes to
extend AbstractJob, decoupled the model serialization from the job output,
extracted trainer classes and tried to clarify naming and reduce code
complexity. I also added tests for the training M/R code as well as a toy
integration test.
It would be great if someone could review my patch to make sure I didn't
break anything.





