Looks good. We need to change the examples to use naivebayes.* instead of
bayes.*. After that, bayes.* can be deprecated and phased out.



On Mon, Jul 4, 2011 at 1:48 PM, Sebastian Schelter <[email protected]> wrote:

> Hi Robin,
>
> We already figured out the math. It would be great if you could do a short
> proofread of the changes the refactoring introduced.
>
> --sebastian
>
>
> On 04.07.2011 09:06, Robin Anil wrote:
>
>>
>>
>> On Wed, Jun 29, 2011 at 3:33 AM, Ted Dunning <[email protected]> wrote:
>>
>>    Hmmm... not sure.  I thought they were all the same.  It is possible
>>    there is a left-over implementation.
>>
>>    Robin?  Care to comment?
>>
>> Didn't see the thread. Both are based on the same math. The naivebayes one
>> uses vectors instead of text.
>>
>>
>>    On Tue, Jun 28, 2011 at 3:01 PM, Sebastian Schelter <[email protected]> wrote:
>>
>>     > Is org.apache.mahout.classifier.naivebayes also based on that one? I
>>     > thought it was only relevant for org.apache.mahout.classifier.bayes?
>>     >
>>     >
>>     > On 28.06.2011 23:58, Ted Dunning wrote:
>>     >
>>     >> See here:
>>     >> http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.13.8572&rank=1
>>     >>
>>     >> On Tue, Jun 28, 2011 at 2:43 PM, Sebastian Schelter (JIRA) <[email protected]> wrote:
>>
>>     >>
>>     >>
>>     >>>    [
>>     >>> https://issues.apache.org/jira/browse/MAHOUT-746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13056805#comment-13056805
>>     >>> ]
>>     >>>
>>     >>> Sebastian Schelter commented on MAHOUT-746:
>>     >>> -------------------------------------------
>>     >>>
>>     >>> Thank you very much, Sean.
>>     >>>
>>     >>> I wonder whether there is some article/paper that describes this
>>     >>> particular approach of implementing Naive Bayes? A colleague of mine
>>     >>> with a much deeper statistics background and I took a look at the
>>     >>> details of the computation today and we were left with some open
>>     >>> questions.
>>     >>>
>>     >>>> Refactoring of the parallel Naive Bayes implementation in org.apache.mahout.classifier.naivebayes
>>     >>>> -------------------------------------------------------------------------------------------------
>>     >>>>
>>     >>>>                 Key: MAHOUT-746
>>     >>>>                 URL: https://issues.apache.org/jira/browse/MAHOUT-746
>>     >>>>             Project: Mahout
>>     >>>>          Issue Type: Improvement
>>     >>>>          Components: Classification
>>     >>>>    Affects Versions: 0.6
>>     >>>>            Reporter: Sebastian Schelter
>>     >>>>            Assignee: Sebastian Schelter
>>     >>>>             Fix For: 0.6
>>     >>>>
>>     >>>>         Attachments: MAHOUT-746.patch
>>     >>>>
>>     >>>>
>>     >>>> I refactored the code in org.apache.mahout.classifier.naivebayes to
>>     >>>> extend AbstractJob, decoupled the model serialization from the job
>>     >>>> output, extracted trainer classes and tried to clarify naming and
>>     >>>> reduce code complexity. I also added tests for the training M/R code
>>     >>>> as well as a toy integration test.
>>     >>>> It would be great if someone could review my patch to make sure I
>>     >>>> didn't break anything.
>>     >>>
>>     >>> --
>>     >>> This message is automatically generated by JIRA.
>>     >>> For more information on JIRA, see: http://www.atlassian.com/software/jira
>>     >>>
>>     >>>
>>     >>>
>>     >>>
>>     >>
>>     >
>>
>>
>>
>
