[ https://issues.apache.org/jira/browse/SPARK-15957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15564876#comment-15564876 ]
Apache Spark commented on SPARK-15957:
--------------------------------------

User 'yanboliang' has created a pull request for this issue:
https://github.com/apache/spark/pull/15430

> RFormula supports forcing to index label
> ----------------------------------------
>
>                 Key: SPARK-15957
>                 URL: https://issues.apache.org/jira/browse/SPARK-15957
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>            Reporter: Yanbo Liang
>            Assignee: Yanbo Liang
>             Fix For: 2.1.0
>
>
> RFormula currently indexes the label only when it is of string type. If the
> label is numeric and we use RFormula with a classification model, the label
> column metadata contains no label attributes. These attributes are useful
> when making predictions for classification, so we can force the label to be
> indexed by {{StringIndexer}} for classification, whether it is numeric or
> string. SparkR wrappers can then extract the label attributes from the label
> column metadata successfully. This feature can help us fix bugs similar to
> SPARK-15153.
> For regression, we will still keep the label as numeric type.
> In this PR, we add a param indexLabel to control whether to force indexing
> of the label for RFormula.
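
To illustrate why indexing a numeric label matters, here is a minimal plain-Python sketch (not Spark code) of the kind of mapping a label indexer produces. The function name index_label, the most-frequent-first ordering, and the tie-break by string value are assumptions made for this illustration. They are not taken from the pull request.

```python
from collections import Counter


def index_label(values):
    """Map each label value to a 0-based index, most frequent value first.

    Returns (indexed_values, labels), where labels[i] is the original
    value (rendered as a string) that was assigned index i.
    """
    counts = Counter(str(v) for v in values)
    # Assumption: order by descending frequency, then by string value so
    # that the result of this sketch is deterministic.
    labels = sorted(counts, key=lambda s: (-counts[s], s))
    to_index = {s: float(i) for i, s in enumerate(labels)}
    return [to_index[str(v)] for v in values], labels


# Numeric class labels that are not already 0..k-1.
raw = [3, 7, 7, 3, 7, 5]
indexed, labels = index_label(raw)

# The list of labels plays the role of the "label attributes": it lets a
# caller translate a predicted index back to the original class value.
recovered = [labels[int(i)] for i in indexed]
```

Without the stored list of labels, a predicted index such as 0.0 could not be mapped back to the original class value, which is the gap the issue describes for numeric labels.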