[ 
https://issues.apache.org/jira/browse/MAHOUT-826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Eastman updated MAHOUT-826:
--------------------------------

    Fix Version/s:     (was: 0.6)
                   0.7

Moving this issue to 0.7 as there has been no activity for some time and it is 
no longer clearly relevant. Happy to move it back if it can be resolved 
this week.
                
> Bayes/CBayes classification on a non-existing feature
> -----------------------------------------------------
>
>                 Key: MAHOUT-826
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-826
>             Project: Mahout
>          Issue Type: Bug
>          Components: Classification
>    Affects Versions: 0.5
>            Reporter: Andre-Philippe Paquet
>            Assignee: Robin Anil
>            Priority: Minor
>             Fix For: 0.7
>
>         Attachments: mahout-826.patch, mahout-826.patch
>
>
> (see http://comments.gmane.org/gmane.comp.apache.mahout.user/9597)
> Using CBayes or Bayes, when trying to classify a feature/word that doesn't 
> exist in the model, the algorithm returns all labels with a constant score 
> (e.g. 12.386649147018964) instead of returning the default/unknown label. 
> After a quick look in CBayesAlgorithm, I found the problem in the 
> featureWeight function, which returns the theta-normalized weight even if the 
> feature didn't have any match (result=0).
> As a fix, I overrode the function in a subclass to return 0 if the weight 
> of the current feature in the current label is 0. 
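
For illustration only, the fix described above might look roughly like the
following sketch. It does not use Mahout's classes: ToyAlgorithm,
ZeroOnUnknownAlgorithm, rawWeight, labelTotal, alpha and vocabSize are all
hypothetical names, and the smoothed formula is an assumed stand-in for the
real theta-normalized weight. It only shows the shape of the override:
return 0 when the feature has no stored weight for the label.

```java
import java.util.HashMap;
import java.util.Map;

public class UnknownFeatureSketch {

    /** Toy stand-in for the base algorithm. NOT Mahout's CBayesAlgorithm. */
    static class ToyAlgorithm {
        // label -> feature -> raw weight observed in training
        private final Map<String, Map<String, Double>> weights = new HashMap<>();
        private final double alpha = 1.0;      // smoothing constant (assumed)
        private final double vocabSize = 4.0;  // vocabulary size (assumed)

        void add(String label, String feature, double weight) {
            weights.computeIfAbsent(label, k -> new HashMap<>())
                   .merge(feature, weight, Double::sum);
        }

        /** Raw stored weight, or 0 when the label/feature pair was never seen. */
        double rawWeight(String label, String feature) {
            Map<String, Double> perLabel = weights.get(label);
            if (perLabel == null) return 0.0;
            Double w = perLabel.get(feature);
            return w == null ? 0.0 : w;
        }

        /** Sum of all raw weights stored for a label. */
        double labelTotal(String label) {
            Map<String, Double> perLabel = weights.get(label);
            if (perLabel == null) return 0.0;
            double sum = 0.0;
            for (double w : perLabel.values()) sum += w;
            return sum;
        }

        /** Smoothed weight: nonzero even when rawWeight is 0 (the reported symptom). */
        double featureWeight(String label, String feature) {
            double numerator = rawWeight(label, feature) + alpha;
            double denominator = labelTotal(label) + alpha * vocabSize;
            return Math.log(numerator / denominator);
        }
    }

    /** Subclass applying the reporter's idea: unmatched features contribute 0. */
    static class ZeroOnUnknownAlgorithm extends ToyAlgorithm {
        @Override
        double featureWeight(String label, String feature) {
            if (rawWeight(label, feature) == 0.0) {
                return 0.0; // feature has no match for this label
            }
            return super.featureWeight(label, feature);
        }
    }

    public static void main(String[] args) {
        ToyAlgorithm base = new ToyAlgorithm();
        ToyAlgorithm fixed = new ZeroOnUnknownAlgorithm();
        base.add("spam", "offer", 3.0);
        fixed.add("spam", "offer", 3.0);

        System.out.println("base,  unseen word: " + base.featureWeight("spam", "unseen"));
        System.out.println("fixed, unseen word: " + fixed.featureWeight("spam", "unseen"));
    }
}
```

With this shape, a document made up only of unknown words scores 0 for every
label, so the caller can detect that case and fall back to the
default/unknown label.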


        
