[jira] [Comment Edited] (SPARK-20043) CrossValidatorModel loader does not recognize impurity "Gini" and "Entropy" on ML random forest and decision. Only "gini" and "entropy" (in lower case) are accept

2017-03-23 Thread 颜发才

[ https://issues.apache.org/jira/browse/SPARK-20043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15939571#comment-15939571 ]

Yan Facai (颜发才) edited comment on SPARK-20043 at 3/24/17 2:15 AM:
--

The bug can be reproduced.

I'd like to work on it.


was (Author: facai):
The bug can be reproduced by:

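```scala
// Meant to run inside a Spark ML test suite: `dataset` and
// `testDefaultReadWrite` are assumed to be provided by the enclosing suite.
import org.apache.spark.ml.classification.DecisionTreeClassifier
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

test("cross validation with decision tree") {
  val dt = new DecisionTreeClassifier()
  // Mixed-case impurity names, as in the report.
  val dtParamMaps = new ParamGridBuilder()
    .addGrid(dt.impurity, Array("Gini", "Entropy"))
    .build()
  val eval = new BinaryClassificationEvaluator
  val cv = new CrossValidator()
    .setEstimator(dt)
    .setEstimatorParamMaps(dtParamMaps)
    .setEvaluator(eval)
    .setNumFolds(3)
  val cvModel = cv.fit(dataset)

  // The copied model must have the same parent.
  val cv2 = testDefaultReadWrite(cvModel, testParams = false)
}
```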

I'd like to work on it.

> CrossValidatorModel loader does not recognize impurity "Gini" and "Entropy" 
> on ML random forest and decision. Only "gini" and "entropy" (in lower case) 
> are accepted
> 
>
>              Key: SPARK-20043
>              URL: https://issues.apache.org/jira/browse/SPARK-20043
>          Project: Spark
>       Issue Type: Bug
>       Components: ML
> Affects Versions: 2.1.0
>         Reporter: Zied Sellami
>           Labels: starter
>
> I saved a CrossValidatorModel with a decision tree and a random forest. I used 
> a parameter grid to test the "gini" and "entropy" impurities. The saved 
> CrossValidatorModel cannot be loaded when the impurity is not written in 
> lowercase. Spark reports the error "impurity Gini (Entropy) not recognized".



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-20043) CrossValidatorModel loader does not recognize impurity "Gini" and "Entropy" on ML random forest and decision. Only "gini" and "entropy" (in lower case) are accept

2017-03-22 Thread yuhao yang (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-20043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937269#comment-15937269 ]

yuhao yang edited comment on SPARK-20043 at 3/22/17 10:25 PM:
--

This looks like a bug in how tree models are loaded. A toLower should be 
applied when loading impurityType from the metadata. 
Ideally, we should also check other algorithms for similar issues.
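
To make the suggestion concrete, here is a minimal sketch of lower-casing an impurity name before it is looked up. It is not Spark's actual loader code: `ImpuritySketch`, `strictLookup`, and `tolerantLookup` are hypothetical names used only for illustration, and the error message simply mirrors the one quoted in the report.

```scala
import java.util.Locale

// Hypothetical stand-in for a loader's impurity lookup; not Spark's real code.
object ImpuritySketch {
  sealed trait Impurity
  case object Gini extends Impurity
  case object Entropy extends Impurity

  // Strict lookup: only lower-case names match, as described in the report.
  def strictLookup(name: String): Impurity = name match {
    case "gini"    => Gini
    case "entropy" => Entropy
    case other =>
      throw new IllegalArgumentException(s"impurity $other not recognized")
  }

  // Tolerant lookup: normalize the stored name first, then reuse the strict one.
  def tolerantLookup(name: String): Impurity =
    strictLookup(name.toLowerCase(Locale.ROOT))
}
```

With this sketch, `ImpuritySketch.tolerantLookup("Gini")` resolves to `Gini`, while `ImpuritySketch.strictLookup("Gini")` throws. `Locale.ROOT` is used so the result does not depend on the JVM's default locale.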


was (Author: yuhaoyan):
This looks like a bug in how tree models are loaded. A toLower should be 
applied when loading impurityType from the metadata. 



