[GitHub] spark pull request: [SPARK-8600] [ML] Naive Bayes API for spark.ml...

jkbradley Wed, 15 Jul 2015 11:23:54 -0700

Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/7284#issuecomment-121703584
  
    @yanboliang  My issue with "labels" is that it was designed to allow 
arbitrary Double labels, rather than 0-based indices.  For spark.ml's 
NaiveBayes, we can assume that labels have already been converted to 
0-based indices.  Let's try to eliminate labels from the public API.  For that, 
I think we should modify pi and theta to use the original index ordering, 
rather than the arbitrary one which mllib's NaiveBayes might choose.  (We could 
alternatively just make "labels" private, but the problem with that is that pi 
and theta would still use that alternate label ordering.)  Does that sound reasonable?
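
A minimal sketch of the reordering proposed above, written in plain Python rather than actual Spark code. The function name `reorder_to_label_index` and the sample values are hypothetical. It assumes `labels` holds exactly the 0-based indices 0..k-1 in whatever order the mllib model stored them, and that `pi` and `theta` follow that same stored order.

```python
def reorder_to_label_index(labels, pi, theta):
    """Permute pi and theta so that position i holds the entry for label i.

    labels: label values in the model's stored order (assumed to be 0..k-1).
    pi:     one class prior per label, in stored order.
    theta:  one row of per-feature values per label, in stored order.
    """
    k = len(labels)
    if len(pi) != k or len(theta) != k:
        raise ValueError("labels, pi and theta must have the same length")
    if any(l != int(l) for l in labels):
        raise ValueError("labels must be integer-valued")
    indices = [int(l) for l in labels]
    if sorted(indices) != list(range(k)):
        raise ValueError("labels must be exactly the 0-based indices 0..k-1")

    new_pi = [None] * k
    new_theta = [None] * k
    for pos, idx in enumerate(indices):
        # The entry stored at position `pos` belongs to label `idx`.
        new_pi[idx] = pi[pos]
        new_theta[idx] = theta[pos]
    return new_pi, new_theta


# Hypothetical model whose stored ordering is label 2, then 0, then 1.
labels = [2.0, 0.0, 1.0]
pi = [-1.2, -0.9, -1.3]
theta = [[-0.5, -1.0], [-0.7, -0.7], [-1.1, -0.4]]

new_pi, new_theta = reorder_to_label_index(labels, pi, theta)
```

After this permutation, `new_pi[i]` and `new_theta[i]` refer to label `i` directly, so a separate `labels` array would no longer need to be exposed.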





