Never mind, I'll take a deeper look into that paper :)

On 29.06.2011 00:03, Ted Dunning wrote:
Hmmm... not sure.  I thought they were all the same.  It is possible
there is a left-over implementation.

Robin?  Care to comment?

On Tue, Jun 28, 2011 at 3:01 PM, Sebastian Schelter <[email protected]> wrote:

    Is org.apache.mahout.classifier.naivebayes also based on that one?
    I thought it was only relevant for org.apache.mahout.classifier.bayes?


    On 28.06.2011 23:58, Ted Dunning wrote:

        See here:
        
        http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.13.8572&rank=1

        On Tue, Jun 28, 2011 at 2:43 PM, Sebastian Schelter (JIRA)
        <[email protected]> wrote:


            [ https://issues.apache.org/jira/browse/MAHOUT-746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13056805#comment-13056805 ]

            Sebastian Schelter commented on MAHOUT-746:
            -------------------------------------------

            Thank you very much, Sean.

            I wonder whether there is some article or paper that describes this
            particular approach to implementing Naive Bayes. A colleague of
            mine with a much deeper statistics background and I took a look at
            the details of the computation today, and we were left with some
            open questions.

                Refactoring of the parallel Naive Bayes implementation in org.apache.mahout.classifier.naivebayes
                ---------------------------------------------------------------------------------------------------


                                 Key: MAHOUT-746
                                 URL: https://issues.apache.org/jira/browse/MAHOUT-746
                             Project: Mahout
                          Issue Type: Improvement
                          Components: Classification
                    Affects Versions: 0.6
                            Reporter: Sebastian Schelter
                            Assignee: Sebastian Schelter
                             Fix For: 0.6

                         Attachments: MAHOUT-746.patch


                I refactored the code in org.apache.mahout.classifier.naivebayes to
                extend AbstractJob, decoupled the model serialization from the job
                output, extracted trainer classes, and tried to clarify naming and
                reduce code complexity. I also added tests for the training M/R code
                as well as a toy integration test.

                It would be great if someone could review my patch to make sure I
                didn't break anything.

            --
            This message is automatically generated by JIRA.
            For more information on JIRA, see:
            http://www.atlassian.com/software/jira






