[jira] [Comment Edited] (SYSTEMML-700) Inflexible category labels for Multinomial Logistic Regression

Niketan Pansare (JIRA) Thu, 29 Sep 2016 11:38:10 -0700

    [ 
https://issues.apache.org/jira/browse/SYSTEMML-700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15533640#comment-15533640
 ]


Niketan Pansare edited comment on SYSTEMML-700 at 9/29/16 6:37 PM:
-------------------------------------------------------------------

[~JeremyNixon] As an FYI, the algorithm wrappers perform the label 
transformations that you suggested:  
http://apache.github.io/incubator-systemml/algorithms-classification.html#examples

The link to the code where this transformation is performed: 
https://github.com/apache/incubator-systemml/blob/master/src/main/scala/org/apache/sysml/api/ml/PredictionUtils.scala#L52


was (Author: niketanpansare):
As an FYI, the algorithm wrappers perform the label transformations that you 
suggested:  
http://apache.github.io/incubator-systemml/algorithms-classification.html#examples

The link to the code where this transformation is performed: 
https://github.com/apache/incubator-systemml/blob/master/src/main/scala/org/apache/sysml/api/ml/PredictionUtils.scala#L52

> Inflexible category labels for Multinomial Logistic Regression
> --------------------------------------------------------------
>
>                 Key: SYSTEMML-700
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-700
>             Project: SystemML
>          Issue Type: Bug
>          Components: Algorithms
>            Reporter: Jeremy
>            Priority: Minor
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> The Logistic Regression algorithm requires category labels to be the 
> integers 0 through (number of classes - 1). It should be able to handle any 
> set of category labels provided by the user. B_out should have the 
> appropriate size regardless of the label values given, and the algorithm 
> should also preserve the original labeling for the user.
> Added detail:
> The solution I'm currently using is to transform the labels from whatever 
> values they are to 0, 1, 2,... beforehand, and then transform them back to 
> their original labels after the algorithm runs.
> Currently the algorithm doesn't handle class values that don't start at 0 or 
> 1, and doesn't handle non-contiguous integers, both of which can come up. For 
> example, class labels 4, 5, 6 return 5 sets of coefficients (the correct 
> number is 2), and class labels -1, 0, 1 return just one set of coefficients 
> (the correct number is 2).
> Handling frames with string labels would be a really great user experience; 
> internally, that could look like R's coercion. Both glmnet and scikit-learn 
> handle string label arguments, but both APIs are weakly typed as well.
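
The workaround described in the issue (recode the labels to 0, 1, 2,... before training, then map them back afterwards) can be sketched in a few lines of plain Python. This is only an illustration and is not how the SystemML wrappers or PredictionUtils.scala implement it; the function names fit_label_encoding, encode and decode are hypothetical. It also shows the coefficient count the reporter expects: one fewer than the number of distinct labels, whatever the label values are.

```python
# Hypothetical helpers illustrating the recode / restore workaround.
# None of these names come from SystemML.

def fit_label_encoding(labels):
    """Return the sorted distinct labels; a label's position is its code."""
    return sorted(set(labels))

def encode(classes, labels):
    """Map original labels to contiguous integer codes 0..K-1."""
    index = {label: code for code, label in enumerate(classes)}
    return [index[label] for label in labels]

def decode(classes, codes):
    """Map integer codes back to the original labels."""
    return [classes[code] for code in codes]

if __name__ == "__main__":
    # Label sets from the issue, plus a string-valued one.
    for raw in ([4, 5, 6, 4], [-1, 0, 1, 1], ["cat", "dog", "bird"]):
        classes = fit_label_encoding(raw)
        codes = encode(classes, raw)
        # The round trip must reproduce the user's original labels.
        assert decode(classes, codes) == raw
        # Expected number of coefficient sets: distinct labels minus one.
        print(raw, "->", codes, "| coefficient sets:", len(classes) - 1)
```

Because the codes are derived from the set of distinct labels rather than from their numeric values, labels that start at 4, include negative numbers, or are strings all yield codes 0 through K-1. If the underlying script expects labels starting at 1 instead, adding 1 in encode and subtracting 1 in decode would be the corresponding adjustment.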



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)