[
https://issues.apache.org/jira/browse/SPARK-24719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16530493#comment-16530493
]
Marco Gaido commented on SPARK-24719:
-------------------------------------
[~mengxr] I tried passing integer values in the prediction column and could not
reproduce any issue (I tried both distance measures). I also checked the code,
and the prediction column is cast to double where needed. Can you provide a
repro if you hit a problem? If not, is this JIRA meant for a small refactor
that makes the casting clearer? Thanks.
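
For reference, a check along these lines could look like the sketch below. It
is illustrative only, not the exact code that was run: the object name, the toy
data, and the column names are made up, and it assumes a Spark build in which
ClusteringEvaluator exposes setDistanceMeasure (not available in every
release). It evaluates the same data once with an Int prediction column and
once with that column explicitly cast to double, so the two scores can be
compared.

```scala
import org.apache.spark.ml.evaluation.ClusteringEvaluator
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Hypothetical driver object, named here only for illustration.
object IntegerPredictionCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("integer-prediction-check")
      .getOrCreate()
    import spark.implicits._

    // Toy data (made up): two clusters, with the cluster id stored as Int
    // rather than Double. No zero vectors, so the cosine measure is defined.
    val df = Seq(
      (Vectors.dense(1.0, 0.1), 0),
      (Vectors.dense(1.1, 0.2), 0),
      (Vectors.dense(0.1, 1.0), 1),
      (Vectors.dense(0.2, 1.1), 1)
    ).toDF("features", "prediction")

    // Same data with the prediction column explicitly cast to double.
    val dfDouble = df.withColumn("prediction", col("prediction").cast("double"))

    for (measure <- Seq("squaredEuclidean", "cosine")) {
      val evaluator = new ClusteringEvaluator()
        .setFeaturesCol("features")
        .setPredictionCol("prediction")
        .setDistanceMeasure(measure)

      // If integer labels are handled, both calls succeed and the scores agree.
      val withInt = evaluator.evaluate(df)
      val withDouble = evaluator.evaluate(dfDouble)
      println(s"$measure: int=$withInt double=$withDouble")
    }

    spark.stop()
  }
}
```

If the evaluation on the Int column throws or returns a different score than
the double column, that would be the repro asked for above.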
> ClusteringEvaluator supports integer type labels
> ------------------------------------------------
>
> Key: SPARK-24719
> URL: https://issues.apache.org/jira/browse/SPARK-24719
> Project: Spark
> Issue Type: Bug
> Components: ML
> Affects Versions: 2.3.1
> Reporter: Xiangrui Meng
> Priority: Major
>
> ClusteringEvaluator should support integer labels because we output integer
> labels in BisectingKMeans.
> [https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/clustering/BisectingKMeans.scala#L77].
> We should cast numeric types to double in ClusteringEvaluator.
> [~mgaido] Do you have time to work on the fix?