[ 
https://issues.apache.org/jira/browse/SPARK-15957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanbo Liang updated SPARK-15957:
--------------------------------
    Description: 
RFormula indexes the label only when it is of string type. If the label is 
numeric and we use RFormula with a classification model, we cannot extract 
label attributes from the label column metadata. The label attributes are 
useful, so for classification we can force the label to be indexed whether it 
is numeric or string. SparkR wrappers can then extract the label attributes 
from the column metadata. This feature can help us fix bugs similar to 
SPARK-15153.
For regression, we will still keep the numeric type.
We should add a param to RFormula that controls whether to force indexing of 
the label.
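
The proposed behavior can be sketched in plain Python, without Spark. This is 
only a model of the decision described above, not RFormula's implementation: 
the names `should_index_label`, `index_label`, `prepare_label` and the 
`force_index_label` flag are hypothetical, and ordering the label values by 
descending frequency is an assumption modeled on how StringIndexer is 
generally understood to assign indices.

```python
from collections import Counter


def should_index_label(label_is_string, force_index_label=False):
    # Current behavior: index only string labels.
    # Proposed behavior: also index when the (hypothetical) flag is set.
    return label_is_string or force_index_label


def index_label(values):
    # Map each distinct label value to an index; the most frequent value
    # gets 0.0. Ties are broken by the string value so the result is
    # deterministic (an assumption made for this sketch).
    counts = Counter(str(v) for v in values)
    ordered = sorted(counts, key=lambda v: (-counts[v], v))
    mapping = {v: float(i) for i, v in enumerate(ordered)}
    return [mapping[str(v)] for v in values], ordered


def prepare_label(values, label_is_string, force_index_label=False):
    # Returns (label column, label attributes). The attributes are None
    # when the label is passed through as a plain numeric column, which is
    # the case where a wrapper has nothing to extract.
    if should_index_label(label_is_string, force_index_label):
        return index_label(values)
    return [float(v) for v in values], None
```

With a numeric label such as `[3, 7, 7, 3, 7]`, `prepare_label(values, False)` 
returns no attributes (the regression case), while 
`prepare_label(values, False, force_index_label=True)` returns the indexed 
column together with the original label values (the classification case).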

  was:Add a param so that users can force the label to be indexed whether it 
is numeric or string. For classification algorithms, we force label indexing 
by setting it to true.


> RFormula supports forcing to index label
> ----------------------------------------
>
>                 Key: SPARK-15957
>                 URL: https://issues.apache.org/jira/browse/SPARK-15957
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>            Reporter: Yanbo Liang
>            Assignee: Yanbo Liang
>
> RFormula indexes the label only when it is of string type. If the label is 
> numeric and we use RFormula with a classification model, we cannot extract 
> label attributes from the label column metadata. The label attributes are 
> useful, so for classification we can force the label to be indexed whether 
> it is numeric or string. SparkR wrappers can then extract the label 
> attributes from the column metadata. This feature can help us fix bugs 
> similar to SPARK-15153.
> For regression, we will still keep the numeric type.
> We should add a param to RFormula that controls whether to force indexing of 
> the label.



