Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/7284#issuecomment-121703584
@yanboliang My issue with "labels" is that it was designed to allow
arbitrary Double labels, rather than 0-based indices. For spark.ml's
NaiveBayes, we can assume that labels have already been converted to use
0-based indices. Let's try to eliminate labels from the public API. For that,
I think we should modify pi and theta to use the original index ordering,
rather than the arbitrary one which mllib's NaiveBayes might choose. (We could
alternatively just make "labels" private, but the problem with that is that pi
and theta use that alternate label ordering.) Does that sound reasonable?
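
To illustrate the reordering, here is a minimal sketch in plain Python (not Spark code). It assumes "labels" holds each 0-based class index exactly once, in whatever order the mllib model stored them, and that pi and the rows of theta follow that same order. The function name reorder_by_label_index is hypothetical.

```python
def reorder_by_label_index(labels, pi, theta):
    """Permute pi and the rows of theta so that position i holds class index i.

    Assumption: labels[k] is the 0-based class index (stored as a float) that
    pi[k] and theta[k] describe, and every index in 0..n-1 appears exactly once.
    """
    n = len(labels)
    if len(pi) != n or len(theta) != n:
        raise ValueError("labels, pi and theta must have the same length")

    indices = []
    for label in labels:
        if label != int(label):
            raise ValueError("label %r is not an integer index" % (label,))
        indices.append(int(label))
    if sorted(indices) != list(range(n)):
        raise ValueError("labels must be a permutation of 0..n-1")

    new_pi = [0.0] * n
    new_theta = [None] * n
    for position, index in enumerate(indices):
        # The entry stored at `position` belongs to class `index`.
        new_pi[index] = pi[position]
        new_theta[index] = list(theta[position])
    return new_pi, new_theta
```

After this step, "labels" carries no extra information (it would simply be 0, 1, ..., n-1), which is why it could then be dropped from the public API.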