[
https://issues.apache.org/jira/browse/SPARK-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15294588#comment-15294588
]
DB Tsai commented on SPARK-7159:
--------------------------------
Hello [~sethah],
I think we will make it a separate SoftmaxRegression or
MultinomialLogisticRegression class, since the two models behave differently
with respect to pivoting. See
https://en.wikipedia.org/wiki/Multinomial_logistic_regression#As_a_set_of_independent_binary_regressions
for details. For this reason, GLMNET has two independent implementations. In
MLOR, people normally regularize the coefficients without pivoting, so you end
up with n * k coefficients, where n is the number of features and k is the
number of classes. In binary LOR, pivoting is performed by default, so we end
up with n coefficients. Note that you can of course pivot in MLOR, but the
choice of pivot class leads to different solutions, which is why people don't
pivot in MLOR.
I have already started working on this. If you have time to help, I'm willing
to hand it over to you and help you implement it. Let me know what you think.
Thanks.
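To make the pivoting point concrete, here is a minimal sketch in plain Python (not Spark code). The coefficient values, the feature vector, and the helper names softmax_probs, pivot, and l2_penalty are all made up for illustration. It shows that pivoting on any class leaves the predicted probabilities unchanged, while the L2 penalty of the pivoted coefficients depends on which class was chosen, which is one way to see why a regularized fit would differ by pivot class.

```python
import math

def softmax_probs(coefs, x):
    """Class probabilities for one feature vector x; coefs is a k x n list of lists."""
    margins = [sum(w * xi for w, xi in zip(row, x)) for row in coefs]
    m = max(margins)  # subtract the max margin for numerical stability
    exps = [math.exp(v - m) for v in margins]
    total = sum(exps)
    return [e / total for e in exps]

def pivot(coefs, p):
    """Subtract class p's coefficients from every class, so row p becomes all zeros."""
    ref = coefs[p]
    return [[w - r for w, r in zip(row, ref)] for row in coefs]

def l2_penalty(coefs):
    """Sum of squared coefficients (the unscaled L2 regularization term)."""
    return sum(w * w for row in coefs for w in row)

# Hypothetical model: k = 3 classes, n = 2 features, so n * k = 6 coefficients.
coefs = [[1.0, -2.0], [0.5, 0.0], [-1.0, 3.0]]
x = [0.3, 1.2]

full = softmax_probs(coefs, x)
piv0 = pivot(coefs, 0)  # class 0 as the pivot: n * (k - 1) free coefficients
piv2 = pivot(coefs, 2)  # class 2 as the pivot

# Pivoting does not change the predictions...
same_probs = all(
    abs(a - b) < 1e-12 and abs(a - c) < 1e-12
    for a, b, c in zip(full, softmax_probs(piv0, x), softmax_probs(piv2, x))
)
print("same probabilities:", same_probs)

# ...but the penalty on the pivoted coefficients depends on the pivot class.
print("L2 penalty, no pivot:     ", l2_penalty(coefs))
print("L2 penalty, pivot class 0:", l2_penalty(piv0))
print("L2 penalty, pivot class 2:", l2_penalty(piv2))
```

With k = 2 the same pivot leaves a single non-zero row of n coefficients, which matches the binary LOR case described above.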
> Support multiclass logistic regression in spark.ml
> --------------------------------------------------
>
> Key: SPARK-7159
> URL: https://issues.apache.org/jira/browse/SPARK-7159
> Project: Spark
> Issue Type: New Feature
> Components: ML
> Reporter: Joseph K. Bradley
> Assignee: DB Tsai
> Priority: Critical
>
> This should be implemented by checking the input DataFrame's label column for
> feature metadata specifying the number of classes.