[jira] [Commented] (MAHOUT-1489) Interactive Scala Spark Bindings Shell Script processor

2014-03-27 Thread Dmitriy Lyubimov (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13948948#comment-13948948 ] Dmitriy Lyubimov commented on MAHOUT-1489: -- This issues not about it and this

[jira] [Commented] (MAHOUT-1489) Interactive Scala Spark Bindings Shell Script processor

2014-03-27 Thread Dmitriy Lyubimov (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13948952#comment-13948952 ] Dmitriy Lyubimov commented on MAHOUT-1489: -- yes no this is not the scope no

[jira] [Commented] (MAHOUT-1489) Interactive Scala Spark Bindings Shell Script processor

2014-03-27 Thread Dmitriy Lyubimov (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13948959#comment-13948959 ] Dmitriy Lyubimov commented on MAHOUT-1489: -- hm email quoting did not work. i

[jira] [Commented] (MAHOUT-1490) Data frame R-like bindings

2014-03-27 Thread Dmitriy Lyubimov (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13948962#comment-13948962 ] Dmitriy Lyubimov commented on MAHOUT-1490: -- could be, i have no opinion on

[jira] [Commented] (MAHOUT-1490) Data frame R-like bindings

2014-03-27 Thread Dmitriy Lyubimov (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13948974#comment-13948974 ] Dmitriy Lyubimov commented on MAHOUT-1490: -- could be, i have no opinion on

[jira] [Commented] (MAHOUT-1489) Interactive Scala Spark Bindings Shell Script processor

2014-03-27 Thread Dmitriy Lyubimov (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13948973#comment-13948973 ] Dmitriy Lyubimov commented on MAHOUT-1489: -- yes no this is not the scope no

[jira] [Commented] (MAHOUT-1489) Interactive Scala Spark Bindings Shell Script processor

2014-03-27 Thread Dmitriy Lyubimov (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13948978#comment-13948978 ] Dmitriy Lyubimov commented on MAHOUT-1489: -- bq. 1) Ability to execute against a

[jira] [Commented] (MAHOUT-1489) Interactive Scala Spark Bindings Shell Script processor

2014-03-27 Thread Dmitriy Lyubimov (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13948985#comment-13948985 ] Dmitriy Lyubimov commented on MAHOUT-1489: -- See

[jira] [Commented] (MAHOUT-1489) Interactive Scala Spark Bindings Shell Script processor

2014-03-27 Thread Dmitriy Lyubimov (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13948991#comment-13948991 ] Dmitriy Lyubimov commented on MAHOUT-1489: -- Hm. they actually copy-and-hack the

[jira] [Commented] (MAHOUT-1489) Interactive Scala Spark Bindings Shell Script processor

2014-03-27 Thread Saikat Kanjilal (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13949202#comment-13949202 ] Saikat Kanjilal commented on MAHOUT-1489: - I would vote to take this code and

[jira] [Comment Edited] (MAHOUT-1489) Interactive Scala Spark Bindings Shell Script processor

2014-03-27 Thread Saikat Kanjilal (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13949202#comment-13949202 ] Saikat Kanjilal edited comment on MAHOUT-1489 at 3/27/14 12:14 PM:

[jira] [Created] (MAHOUT-1492) Doap file has references to cwiki still

2014-03-27 Thread Ted Dunning (JIRA)
Ted Dunning created MAHOUT-1492: --- Summary: Doap file has references to cwiki still Key: MAHOUT-1492 URL: https://issues.apache.org/jira/browse/MAHOUT-1492 Project: Mahout Issue Type: Bug

[jira] [Commented] (MAHOUT-1492) Doap file has references to cwiki still

2014-03-27 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13949668#comment-13949668 ] Ted Dunning commented on MAHOUT-1492: - Also did a grep for cwiki and found a

[jira] [Commented] (MAHOUT-1492) Doap file has references to cwiki still

2014-03-27 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13949719#comment-13949719 ] Ted Dunning commented on MAHOUT-1492: - Committed trivial fixes. Doap file has

[jira] [Resolved] (MAHOUT-1492) Doap file has references to cwiki still

2014-03-27 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning resolved MAHOUT-1492. - Resolution: Fixed Doap file has references to cwiki still

[jira] [Commented] (MAHOUT-1492) Doap file has references to cwiki still

2014-03-27 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13949841#comment-13949841 ] Hudson commented on MAHOUT-1492: SUCCESS: Integrated in Mahout-Quality #2542 (See

MAHOUT-1369 - Why does theta normalization for naive bayes classification commented out?

2014-03-27 Thread Chandler Burgess
Hello all, It seems Robin Anil hasn't responded, and no one is sure of the status on this. What needs to be done on this, and/or what can I do to help? I'm no ML expert, but I do have the paper and should be able to verify/fix the implementation. I'm REALLY interested in using the CNB

Re: MAHOUT-1369 - Why does theta normalization for naive bayes classification commented out?

2014-03-27 Thread Suneel Marthi
Which Mahout version are you running? While it's true that ThetaNormalizer is still disabled today, MAHOUT-1389 fixes a bug wherein Complementary NB wasn't being called when invoked. Please test with Mahout 0.9 or trunk. On Thursday, March 27, 2014 3:53 PM, Chandler Burgess

Re: MAHOUT-1369 - Why does theta normalization for naive bayes classification commented out?

2014-03-27 Thread Sebastian Schelter
Hi Chandler, I think a good way to go would be to re-enable theta normalization and run the classification examples that we already have to see how it affects the result (and make sure it improves the result). It would be great to have this fixed. I'm also planning to port NB to our Spark DSL

[jira] [Created] (MAHOUT-1493) Port Naive Bayes to the Spark DSL

2014-03-27 Thread Sebastian Schelter (JIRA)
Sebastian Schelter created MAHOUT-1493: -- Summary: Port Naive Bayes to the Spark DSL Key: MAHOUT-1493 URL: https://issues.apache.org/jira/browse/MAHOUT-1493 Project: Mahout Issue Type:

[jira] [Updated] (MAHOUT-1493) Port Naive Bayes to the Spark DSL

2014-03-27 Thread Sebastian Schelter (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Schelter updated MAHOUT-1493: --- Attachment: MAHOUT-1493.patch preliminary patch. lacks preparation code and

RE: MAHOUT-1369 - Why does theta normalization for naive bayes classification commented out?

2014-03-27 Thread Chandler Burgess
Hi Suneel, I'm using 0.9. I did not train using Complementary NB, but was only using it for testing. I'm not real familiar with the math but can see CNBClassifier is scoring differently than SNBClassifier, so I thought I would see something, but the scores and results from testnb didn't

RE: MAHOUT-1369 - Why does theta normalization for naive bayes classification commented out?

2014-03-27 Thread Chandler Burgess
Ok, I'll uncomment those lines and see. I also have plenty of test data available (I'm doing document classification with unbalanced classes), so I'll see if it improves there as well. Also, I'll try to make some time in the next week and go over the algorithm in detail compared with the

[jira] [Commented] (MAHOUT-1489) Interactive Scala Spark Bindings Shell Script processor

2014-03-27 Thread Saikat Kanjilal (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13950004#comment-13950004 ] Saikat Kanjilal commented on MAHOUT-1489: - Initial github repo:

Re: MAHOUT-1369 - Why does theta normalization for naive bayes classification commented out?

2014-03-27 Thread Suneel Marthi
Just checking, you are testing Cbayes on a model that's already been trained using Cbayes, correct? Also, the JIRA I mentioned earlier was fixed for 0.9, so you should be good. No code changes were made to Naive Bayes since 0.9. On Mar 27, 2014, at 6:01 PM, Chandler Burgess

RE: MAHOUT-1369 - Why does theta normalization for naive bayes classification commented out?

2014-03-27 Thread Chandler Burgess
The program I wrote didn't use a model that was trained with Cbayes. After looking at the scorers in SNB and CNB, I figured they would give different results even on a model not trained with CNB. That could very well be ignorance on my part as to the math. However, I did some command line

Re: [jira] [Created] (MAHOUT-1493) Port Naive Bayes to the Spark DSL

2014-03-27 Thread Dmitriy Lyubimov
so the only distributed use here is the colsums summaries, right? On Thu, Mar 27, 2014 at 2:16 PM, Sebastian Schelter (JIRA) j...@apache.org wrote: Sebastian Schelter created MAHOUT-1493: -- Summary: Port Naive Bayes to the Spark DSL

[jira] [Commented] (MAHOUT-1489) Interactive Scala Spark Bindings Shell Script processor

2014-03-27 Thread Dmitriy Lyubimov (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13950273#comment-13950273 ] Dmitriy Lyubimov commented on MAHOUT-1489: -- I think it is a good start. (1) We

[jira] [Commented] (MAHOUT-1489) Interactive Scala Spark Bindings Shell Script processor

2014-03-27 Thread Saikat Kanjilal (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13950282#comment-13950282 ] Saikat Kanjilal commented on MAHOUT-1489: - Answers embedded: (1) We probably

[jira] [Commented] (MAHOUT-1493) Port Naive Bayes to the Spark DSL

2014-03-27 Thread Dmitriy Lyubimov (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13950283#comment-13950283 ] Dmitriy Lyubimov commented on MAHOUT-1493: -- I don't think you meant run() to

[jira] [Commented] (MAHOUT-1489) Interactive Scala Spark Bindings Shell Script processor

2014-03-27 Thread Dmitriy Lyubimov (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13950290#comment-13950290 ] Dmitriy Lyubimov commented on MAHOUT-1489: -- yeah. realistically,

[jira] [Comment Edited] (MAHOUT-1493) Port Naive Bayes to the Spark DSL

2014-03-27 Thread Dmitriy Lyubimov (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13950283#comment-13950283 ] Dmitriy Lyubimov edited comment on MAHOUT-1493 at 3/28/14 2:30 AM:

[jira] [Commented] (MAHOUT-1493) Port Naive Bayes to the Spark DSL

2014-03-27 Thread Dmitriy Lyubimov (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13950293#comment-13950293 ] Dmitriy Lyubimov commented on MAHOUT-1493: -- PS. it's an R naming style. R almost