[ 
https://issues.apache.org/jira/browse/MAHOUT-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Palumbo updated MAHOUT-1502:
-----------------------------------

    Attachment: MAHOUT-1502_draft.patch

Here's a patch for a draft of the reworked Naive Bayes page.  I was hoping to 
get some feedback on whether its style and content are what you're looking 
for.

I've basically taken table 4 from Rennie and made a few minor changes to 
steps 1 and 2 ("preprocessing") to reflect the TF-IDF transformations actually 
made by the Lucene DefaultSimilarity class called from seq2sparse, and then 
given a brief overview of the corresponding NB CLI commands.  I've made no 
mention of any Java code except for its location. 
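
For reference, here is a rough sketch of the TF-IDF weighting referred to 
above.  It assumes the classic Lucene DefaultSimilarity formulas 
(tf = sqrt(freq), idf = log(numDocs / (docFreq + 1)) + 1) and that the weight 
is simply their product.  The function names are made up for illustration, and 
this is not the Mahout or Lucene code itself:

```python
import math


def lucene_tf(freq):
    """Term-frequency factor: square root of the raw in-document count."""
    return math.sqrt(freq)


def lucene_idf(doc_freq, num_docs):
    """Inverse document frequency with the +1 smoothing terms.

    Assumes the classic DefaultSimilarity formula; Lucene itself returns a
    float, so real values may differ slightly in the last digits.
    """
    return math.log(num_docs / (doc_freq + 1)) + 1.0


def tfidf(freq, doc_freq, num_docs):
    """Weight of one term in one document: tf factor times idf factor."""
    return lucene_tf(freq) * lucene_idf(doc_freq, num_docs)


if __name__ == "__main__":
    # A term seen 4 times in a document, present in 9 of 100 documents.
    print(tfidf(freq=4, doc_freq=9, num_docs=100))
```

The square-root TF factor and the smoothed IDF are the parts that differ from 
the log-based transforms in the paper, which is why steps 1 and 2 needed 
adjusting.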

I probably need to rewrite the implementation section completely.    

A few questions I had:
1. Do we want to stick with "Bayes" and "CBayes"?  I've written it this way, 
but I think the names could be a little confusing.   

2. Should I provide a more thorough end-to-end explanation of building a model 
from the command line?  I am thinking no, since the 20 Newsgroups page has that. 
(I think that page also needs some work.  I'm not sure if there is a JIRA open 
for that.)

3. Should there be a Java section on building an NB model? 


Also, I'm not sure whether what I've called "preprocessing" (steps 1-3) belongs 
on this page.  I've left those steps in because the Rennie paper references 
them pretty heavily, but they could confuse things, as they are more an issue 
for seq2sparse (which I'm increasingly thinking deserves its own page).  

Let me know of any changes that need to be made.
Thanks.   

> Update Naive Bayes Webpage to Current Implementation 
> -----------------------------------------------------
>
>                 Key: MAHOUT-1502
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1502
>             Project: Mahout
>          Issue Type: Bug
>          Components: Documentation
>    Affects Versions: 0.9
>            Reporter: Andrew Palumbo
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: MAHOUT-1502_draft.patch
>
>
> The current Naive Bayes page describes the pre-0.7 NB implementation:
> https://mahout.apache.org/users/classification/bayesian.html
> Post-0.7, TF-IDF calculations are performed outside of NB.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
