RE: MAHOUT-1369 - Why does theta normalization for naive bayes classification commented out?

Chandler Burgess Fri, 28 Mar 2014 13:30:51 -0700

I forgot to include this in the last mail. Again, I do have the Rennie paper, 
which I'll dig into to see if I can fix this sometime in the near future. I'll 
also look at the problem with the -seq flag to testnb.
All the guidelines for submitting patches are on JIRA or the mahout.apache.org 
pages, correct?

Chandler

-----Original Message-----
From: Chandler Burgess [mailto:[email protected]] 
Sent: Friday, March 28, 2014 3:16 PM
To: [email protected]
Subject: RE: MAHOUT-1369 - Why does theta normalization for naive bayes 
classification commented out?

Well, maybe someone can correct me, but this seems disappointing. I uncommented 
the code in NaiveBayesModel, BayesUtil and TrainNaiveBayesJob, added some trace 
statements in ComplementaryThetaMapper and ComplementaryNaiveBayesClassifier to 
verify they were being called, and then ran some tests using trainnb/testnb. 
There was not a single difference in the classifications when the 
train/test complementary options were specified vs. standard naïve Bayes.

Also, running testnb with the -seq flag doesn't appear to work.

-----Original Message-----
From: Chandler Burgess [mailto:[email protected]]
Sent: Thursday, March 27, 2014 5:17 PM
To: [email protected]
Subject: RE: MAHOUT-1369 - Why does theta normalization for naive bayes 
classification commented out?

The program I wrote didn't use a model that was trained with Cbayes. After 
looking at the scorers in SNB and CNB, I figured they would give different 
results even on a model not trained with CNB. That could very well be ignorance 
on my part as to the math. 

However, I did some command line tests using -c on both training and testing 
and didn't see any difference in the testnb output.
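
For reference, here is a minimal Python sketch of the standard and complement scoring rules as I read them in the Rennie paper, plus the weight normalization step that is assumed here to correspond to what this thread calls theta normalization. It is not Mahout's code: the function names, the smoothing constant alpha=1 and the toy counts are all made up for illustration, and class priors are left out.

```python
import math


def term_weights(counts, alpha=1.0):
    """Return (standard, complement) log-weights per class.

    counts maps each label to a list of per-term counts (toy data).
    alpha is an assumed Laplace smoothing constant.
    """
    labels = sorted(counts)
    n_terms = len(counts[labels[0]])
    standard, complement = {}, {}
    for c in labels:
        own = counts[c]
        # Term counts summed over every class except c.
        rest = [sum(counts[o][i] for o in labels if o != c)
                for i in range(n_terms)]
        standard[c] = [math.log((n + alpha) / (sum(own) + alpha * n_terms))
                       for n in own]
        complement[c] = [math.log((n + alpha) / (sum(rest) + alpha * n_terms))
                         for n in rest]
    return standard, complement


def normalize(weights):
    """Weight normalization: divide each class's weights by the sum of
    their absolute values, so every class has total absolute weight 1."""
    out = {}
    for c, ws in weights.items():
        total = sum(abs(w) for w in ws)
        out[c] = [w / total for w in ws]
    return out


def score(weights, doc, sign=1.0):
    """Dot product of term frequencies with class weights.

    Use sign=-1.0 for complement weights: a document that fits the
    complement of a class poorly should score high for that class.
    """
    return {c: sign * sum(f * w for f, w in zip(doc, ws))
            for c, ws in weights.items()}


def predict(scores):
    """Label with the highest score."""
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical unbalanced three-class term counts.
    counts = {"sports": [30, 2, 1], "politics": [3, 5, 4], "tech": [1, 2, 6]}
    doc = [1, 1, 2]
    std, comp = term_weights(counts)
    print("standard  :", score(std, doc))
    print("complement:", score(comp, doc, sign=-1.0))
    print("normalized:", score(normalize(comp), doc, sign=-1.0))
```

With only two classes, the complement of one class is simply the other class, so the unnormalized, prior-free complement rule picks the same label as the standard rule, although the score values themselves differ.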
________________________________________
From: Suneel Marthi <[email protected]>
Sent: Thursday, March 27, 2014 5:12 PM
To: [email protected]
Cc: [email protected]
Subject: Re: MAHOUT-1369 - Why does theta normalization for naive bayes 
classification commented out?

Just checking: you are testing Cbayes on a model that's already been trained 
using Cbayes, correct?

Also, the JIRA I mentioned earlier was fixed for 0.9, so you should be good. No 
code changes have been made to naive bayes since 0.9.

Sent from my iPhone

> On Mar 27, 2014, at 6:01 PM, Chandler Burgess <[email protected]> 
> wrote:
>
> Ok, I'll uncomment those lines and see. I also have plenty of test data 
> available too (I'm doing document classification with unbalanced classes), 
> so I'll see if it improves there as well.
>
> Also, I'll try to make some time in the next week and go over the algorithm 
> in detail compared with the paper as an extra check.
>
> Thanks,
> Chandler
> ________________________________________
> From: Sebastian Schelter <[email protected]>
> Sent: Thursday, March 27, 2014 4:01 PM
> To: [email protected]
> Subject: Re: MAHOUT-1369 - Why does theta normalization for naive bayes 
> classification commented out?
>
> Hi Chandler,
>
> I think a good way to go would be to reenable theta normalization and 
> run the classification examples that we already have to see how it 
> affects the result (and make sure it improves the result).
>
> Would be great to have this fixed. I'm also planning to port NB to our 
> Spark DSL very soon (should be just a few lines of code).
>
> --sebastian
>
>
>> On 03/27/2014 09:07 PM, Suneel Marthi wrote:
>> Which Mahout version are you running? While it's true that ThetaNormalizer 
>> is still disabled today, MAHOUT-1389 fixes a bug wherein Complementary NB 
>> wasn't being called when invoked.
>>
>> Please test with Mahout 0.9 or trunk.
>>
>>
>>
>>
>> On Thursday, March 27, 2014 3:53 PM, Chandler Burgess 
>> <[email protected]> wrote:
>>
>> Hello all,
>>
>> It seems Robin Anil hasn't responded, and no one is sure of the status on 
>> this. What needs to be done on this, and/or what can I do to help? I'm no ML 
>> expert, but I do have the paper and should be able to verify/fix the 
>> implementation. I'm REALLY interested in using the CNB classifier, since it 
>> seems well suited to the problem I'm trying to tackle, before I give up and 
>> use something else.
>>
>> I've done tests and see no difference when -c is passed on the command line 
>> for training or testing. I also wrote a program to print the scores using 
>> StandardNaiveBayesClassifier and ComplementaryNaiveBayesClassifier in a 
>> binary classification problem and see no difference between the scores, so 
>> it seems complementary naïve bayes is completely disabled.
>>
>> Thanks,
>> Chandler Burgess
>
