[
https://issues.apache.org/jira/browse/SPARK-15957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joseph K. Bradley updated SPARK-15957:
--------------------------------------
Shepherd: Joseph K. Bradley
Assignee: Yanbo Liang
> RFormula supports forcing to index label
> ----------------------------------------
>
> Key: SPARK-15957
> URL: https://issues.apache.org/jira/browse/SPARK-15957
> Project: Spark
> Issue Type: Improvement
> Components: ML
> Reporter: Yanbo Liang
> Assignee: Yanbo Liang
>
> Currently, RFormula indexes the label only when it is of string type. If the
> label is numeric and we use RFormula with a classification model, the label
> column metadata contains no label attributes. The label attributes are
> useful when making predictions for classification, so for classification we
> can force the label to be indexed by {{StringIndexer}} whether it is of
> numeric or string type. SparkR wrappers can then extract the label attributes
> from the label column metadata. This feature can help us fix bugs similar
> to SPARK-15153.
> For regression, we will still keep the label as numeric type.
> In this PR, we add a param indexLabel to control whether RFormula is forced
> to index the label.
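
The behaviour described above can be illustrated with a minimal sketch in plain Python. This is not Spark code: index_label and force_index are hypothetical names standing in for RFormula's label handling and the proposed indexLabel param. Ordering categories by descending frequency mirrors the default of StringIndexer; breaking ties alphabetically is an assumption made here only to keep the output deterministic.

```python
from collections import Counter


def index_label(labels, force_index=False):
    """Hypothetical stand-in for RFormula's label handling.

    Returns (values, attributes). attributes is the ordered list of
    original label names, or None when the label was not indexed.
    """
    is_string = all(isinstance(v, str) for v in labels)
    if not (is_string or force_index):
        # Numeric label without forcing (the regression case):
        # keep the values as they are and attach no label attributes.
        return [float(v) for v in labels], None

    # Treat every value as a category name, as an indexer would.
    names = [str(v) for v in labels]
    counts = Counter(names)
    # Most frequent category gets index 0.
    # Assumption: ties are broken alphabetically for determinism.
    attributes = sorted(counts, key=lambda name: (-counts[name], name))
    position = {name: float(i) for i, name in enumerate(attributes)}
    return [position[name] for name in names], attributes


# Numeric label, not forced: no attributes are available afterwards.
values, attrs = index_label([1.0, 0.0, 1.0, 2.0])

# Numeric label, forced: the attributes record the original labels.
forced_values, forced_attrs = index_label([1.0, 0.0, 1.0, 2.0], force_index=True)
```

In the forced case the returned attributes play the role of the label column metadata: a wrapper can map a predicted index back to the original label by looking it up in that list.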
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)