GitHub user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/2137#issuecomment-67246219
@BigCrunsh Could we please discuss our respective interfaces to converge
on some standard naming conventions? I'd like to get your PR merged to
update the main spark.mllib package, but also make sure it matches the
experimental spark.ml package as much as possible. There are really only a few
items to decide.
My votes on naming classes & methods:
* BinaryClassificationModel -> BinaryClassificationGLM (more precise)
* predictClass -> predict
  * I vote for "predict()" always meaning predict the label (class for
    classification, or real value for regression). That seems more standard (e.g.,
    scikit-learn uses this convention).
* predictScore -> predictRaw?
  * "score" is a very overloaded term, and "raw" might be more
    intuitive.
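
To make the proposed names concrete, here is a minimal Scala sketch of how they might fit together for a binary linear model. It is only an illustration under stated assumptions: the members `weights`, `intercept`, and `threshold`, the class `SimpleLinearClassifier`, and the plain `Array[Double]` feature type are hypothetical and are not taken from either PR.

```scala
// Hypothetical sketch of the proposed naming; not the actual spark.mllib or spark.ml API.
trait BinaryClassificationGLM {
  def weights: Array[Double]   // assumed member, for illustration only
  def intercept: Double        // assumed member, for illustration only
  def threshold: Double        // assumed decision threshold on the raw value

  /** Raw, uncalibrated output (the margin w.x + b); the proposed rename of predictScore. */
  def predictRaw(features: Array[Double]): Double = {
    require(features.length == weights.length, "dimension mismatch")
    var sum = intercept
    var i = 0
    while (i < weights.length) {
      sum += weights(i) * features(i)
      i += 1
    }
    sum
  }

  /** Predicted label (0.0 or 1.0); the proposed rename of predictClass. */
  def predict(features: Array[Double]): Double =
    if (predictRaw(features) > threshold) 1.0 else 0.0
}

// Hypothetical concrete model used only to exercise the trait.
final case class SimpleLinearClassifier(
    weights: Array[Double],
    intercept: Double,
    threshold: Double = 0.0) extends BinaryClassificationGLM
```

With this shape, `predict()` always returns a label, and callers who need the underlying margin ask for it explicitly through `predictRaw()`.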
+1 for ProbabilisticClassificationModel. But the current version sounds
specific to binary classification. Would you want to rename it to
ProbabilisticBinaryClassificationModel, or generalize it to return the
probability for each possible label? (I'm doing the latter in my PR, using
[predictProbabilities()](https://github.com/jkbradley/spark/blob/ml-api-part1/mllib/src/main/scala/org/apache/spark/ml/impl/estimator/ProbabilisticClassificationModel.scala).)
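
As a rough illustration of the generalized option, the sketch below returns one probability per label, so the binary case is simply two classes. Everything here except the method name `predictProbabilities` is an assumption made for the example: the trait and class names, the `Array[Double]` types, and the use of a softmax over raw values are hypothetical and may differ from the linked file.

```scala
// Hypothetical sketch; only the name predictProbabilities comes from the discussion.
trait ProbabilisticClassifierSketch {
  def numClasses: Int

  /** One raw (uncalibrated) value per class. */
  def predictRaw(features: Array[Double]): Array[Double]

  /** One probability per class. Softmax is an assumption made for this example. */
  def predictProbabilities(features: Array[Double]): Array[Double] = {
    val raw = predictRaw(features)
    val shift = raw.max                        // subtract the max for numerical stability
    val exps = raw.map(r => math.exp(r - shift))
    val total = exps.sum
    exps.map(_ / total)
  }

  /** Predicted label: the index of the most probable class, as a Double. */
  def predict(features: Array[Double]): Double = {
    val probs = predictProbabilities(features)
    probs.indexOf(probs.max).toDouble
  }
}

// Hypothetical linear model with one weight vector per class.
final class LinearMulticlassSketch(weights: Array[Array[Double]])
    extends ProbabilisticClassifierSketch {
  val numClasses: Int = weights.length

  def predictRaw(features: Array[Double]): Array[Double] =
    weights.map(w => w.zip(features).map { case (a, b) => a * b }.sum)
}
```

A binary-only variant would instead return a single probability for the positive class, which is the alternative that the longer ProbabilisticBinaryClassificationModel name would describe.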
After we settle on these items, I'd like to make a detailed pass over
your PR. Thanks in advance!