Great. The details on how to submit a patch are here:

https://mahout.apache.org/developers/how-to-contribute.html

--sebastian

On 03/28/2014 09:29 PM, Chandler Burgess wrote:
Forgot to include this in the last mail. Again, I do have the Rennie paper, which 
I'll dig into and see if I can fix it sometime in the near future. I'll also 
look at the problem with the -seq flag to testnb.
All the guidelines for submitting patches are on JIRA or the mahout.apache.org 
pages, correct?

Chandler

-----Original Message-----
From: Chandler Burgess [mailto:[email protected]]
Sent: Friday, March 28, 2014 3:16 PM
To: [email protected]
Subject: RE: MAHOUT-1369 - Why does theta normalization for naive bayes 
classification commented out?

Well, maybe someone can correct me, but this seems disappointing. I uncommented 
the code in NaiveBayesModel, BayesUtil and TrainNaiveBayesJob, added some trace 
statements in ComplementaryThetaMapper and ComplementaryNaiveBayesClassifier to 
verify they were being called, and then ran some tests using trainnb/testnb. 
There was not a single difference in the classifications when complementary 
training/testing was specified versus standard naive Bayes.

Also, running testnb with the -seq flag doesn't appear to work.

-----Original Message-----
From: Chandler Burgess [mailto:[email protected]]
Sent: Thursday, March 27, 2014 5:17 PM
To: [email protected]
Subject: RE: MAHOUT-1369 - Why does theta normalization for naive bayes 
classification commented out?

The program I wrote didn't use a model that was trained with Cbayes. After 
looking at the scorers in SNB and CNB, I figured they would give different 
results even on a model not trained with CNB. That could very well be ignorance 
on my part as to the math.
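
For reference, my (possibly wrong) reading of the two scoring rules from the 
Rennie et al. paper is roughly:

    standard NB:       score(c, d) = \log p(c) + \sum_i f_i \log \hat{\theta}_{c,i}
    complementary NB:  score(c, d) = \log p(c) - \sum_i f_i \log \hat{\theta}_{\tilde{c},i}

where \hat{\theta}_{c,i} is estimated from the counts for class c and 
\hat{\theta}_{\tilde{c},i} from the counts of every class except c, which is why 
I expected the two scorers to disagree even on a model not trained with CNB.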

However, I did some command line tests using -c on both training and testing 
and didn't see any difference in the testnb output.
________________________________________
From: Suneel Marthi <[email protected]>
Sent: Thursday, March 27, 2014 5:12 PM
To: [email protected]
Cc: [email protected]
Subject: Re: MAHOUT-1369 - Why does theta normalization for naive bayes 
classification commented out?

Just checking, you are testing Cbayes on a model that's already been trained using 
Cbayes, correct?

Also, the JIRA I mentioned earlier was fixed for 0.9, so you should be good. No 
code changes have been made to naive Bayes since 0.9.


Sent from my iPhone

On Mar 27, 2014, at 6:01 PM, Chandler Burgess <[email protected]> wrote:

OK, I'll uncomment those lines and see. I also have plenty of test data 
available (I'm doing document classification with unbalanced classes), so 
I'll see if it improves the results there as well.

Also, I'll try to make some time in the next week to go over the algorithm in 
detail against the paper as an extra check.

Thanks,
Chandler
________________________________________
From: Sebastian Schelter <[email protected]>
Sent: Thursday, March 27, 2014 4:01 PM
To: [email protected]
Subject: Re: MAHOUT-1369 - Why does theta normalization for naive bayes 
classification commented out?

Hi Chandler,

I think a good way to go would be to re-enable theta normalization and
run the classification examples that we already have, to see how it
affects the results (and to make sure it improves them).
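
If I remember the Rennie paper correctly, the normalization in question is the 
per-class weight normalization (a rough sketch, not the exact Mahout code):

    w_{c,i} = \log \hat{\theta}_{c,i}
    \hat{w}_{c,i} = w_{c,i} / \sum_k |w_{c,k}|

i.e. each class's weight vector is divided by the sum of the absolute values of 
its weights, so that classes whose weight vectors have large magnitudes don't 
dominate the decision.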

Would be great to have this fixed. I'm also planning to port NB to our
Spark DSL very soon (should be just a few lines of code).

--sebastian


On 03/27/2014 09:07 PM, Suneel Marthi wrote:
Which Mahout version r u running? While its true that ThetaNormalizer is still 
disabled today, Mahout-1389 fixes a bug wherein Complementary NB wasn't being 
called when invoked.

Please test with Mahout 0.9 or trunk.




On Thursday, March 27, 2014 3:53 PM, Chandler Burgess 
<[email protected]> wrote:

Hello all,

It seems Robin Anil hasn't responded, and no one is sure of the status of this. 
What needs to be done, and/or what can I do to help? I'm no ML expert, but I do 
have the paper and should be able to verify/fix the implementation. I'm REALLY 
interested in getting the CNB classifier working before I give up and use 
something else, since it seems well suited to the problem I'm trying to tackle.

I've run tests and see no difference when -c is passed on the command line for 
training or testing. I also wrote a program to print the scores from 
StandardNaiveBayesClassifier and ComplementaryNaiveBayesClassifier on a binary 
classification problem, and I see no difference between the scores, so it seems 
complementary naive Bayes is completely disabled.
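
For anyone who wants to reproduce that, a minimal sketch of the comparison 
(assuming the 0.9 Java API; the class name, model directory and vector file 
below are placeholders for my local setup) looks something like this:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.mahout.classifier.naivebayes.ComplementaryNaiveBayesClassifier;
import org.apache.mahout.classifier.naivebayes.NaiveBayesModel;
import org.apache.mahout.classifier.naivebayes.StandardNaiveBayesClassifier;
import org.apache.mahout.common.Pair;
import org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable;
import org.apache.mahout.math.Vector;
import org.apache.mahout.math.VectorWritable;

public class CompareNaiveBayesScores {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Model directory written by 'mahout trainnb' (placeholder path).
    NaiveBayesModel model = NaiveBayesModel.materialize(new Path("/tmp/model"), conf);

    StandardNaiveBayesClassifier standard = new StandardNaiveBayesClassifier(model);
    ComplementaryNaiveBayesClassifier complementary = new ComplementaryNaiveBayesClassifier(model);

    // TF-IDF vectors from seq2sparse (placeholder path); both classifiers score
    // the same vector so the outputs can be compared side by side.
    Path vectors = new Path("/tmp/tfidf-vectors/part-r-00000");
    for (Pair<Text, VectorWritable> record : new SequenceFileIterable<Text, VectorWritable>(vectors, conf)) {
      Vector instance = record.getSecond().get();
      System.out.println(record.getFirst()
          + " standard=" + standard.classifyFull(instance)
          + " complementary=" + complementary.classifyFull(instance));
    }
  }
}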

Thanks,
Chandler Burgess

