[ 
https://issues.apache.org/jira/browse/SPARK-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15294588#comment-15294588
 ] 

DB Tsai commented on SPARK-7159:
--------------------------------

Hello [~sethah],

I think we will make it a separate SoftmaxRegression or 
MultinomialLogisticRegression class, since the two models behave differently 
with respect to pivoting. See 
https://en.wikipedia.org/wiki/Multinomial_logistic_regression#As_a_set_of_independent_binary_regressions
 for details. For the same reason, GLMNET has two independent 
implementations. In MLOR, people normally regularize the coefficients without 
pivoting, so you end up with n * k coefficients, where n is the number of 
features and k is the number of classes. In binary LOR, pivoting is performed 
by default, so we end up with n coefficients. You can of course pivot in MLOR 
as well, but the solution then depends on which class is chosen as the pivot, 
and that's why people don't pivot in MLOR.
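
To make the pivoting point concrete, here is a minimal sketch in plain Python. 
It is not Spark or GLMNET code; the helper names (softmax_probs, pivot, 
l2_penalty) and the numbers are made up for illustration. Pivoting on class p 
subtracts that class's coefficients from every class, which leaves the 
predicted probabilities unchanged but changes the size of an L2 penalty 
depending on which p is picked.

```python
import math

def softmax_probs(coefs, x):
    """Class probabilities for feature vector x; coefs has k rows of n weights."""
    margins = [sum(w * xi for w, xi in zip(row, x)) for row in coefs]
    top = max(margins)  # subtract the max margin for numerical stability
    exps = [math.exp(z - top) for z in margins]
    total = sum(exps)
    return [e / total for e in exps]

def pivot(coefs, p):
    """Subtract row p from every row, so class p ends up with all-zero weights."""
    ref = coefs[p]
    return [[w - r for w, r in zip(row, ref)] for row in coefs]

def l2_penalty(coefs):
    """Sum of squared coefficients (illustrative L2 regularization term)."""
    return sum(w * w for row in coefs for w in row)

# Hypothetical model: k = 3 classes, n = 2 features, so n * k = 6 coefficients.
coefs = [[1.0, -2.0], [0.5, 0.0], [-1.0, 3.0]]
x = [0.3, 0.7]

full = softmax_probs(coefs, x)
pivot0 = pivot(coefs, 0)  # class 0 as pivot: only n * (k - 1) free coefficients
pivot2 = pivot(coefs, 2)  # class 2 as pivot

print("probabilities (no pivot):", full)
print("probabilities (pivot 0): ", softmax_probs(pivot0, x))
print("probabilities (pivot 2): ", softmax_probs(pivot2, x))
print("L2 penalties:", l2_penalty(coefs), l2_penalty(pivot0), l2_penalty(pivot2))
```

All three parameterizations describe the same model, yet a regularizer scores 
them differently, so a regularized fit that fixes a pivot will generally land 
on a solution that depends on the pivot class.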

I have already started working on this. If you have time to help, I'm happy to 
hand it over and help you implement it. Let me know what you think. 

Thanks.

> Support multiclass logistic regression in spark.ml
> --------------------------------------------------
>
>                 Key: SPARK-7159
>                 URL: https://issues.apache.org/jira/browse/SPARK-7159
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>            Reporter: Joseph K. Bradley
>            Assignee: DB Tsai
>            Priority: Critical
>
> This should be implemented by checking the input DataFrame's label column for 
> feature metadata specifying the number of classes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

