Github user shahidki31 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21274#discussion_r187234165
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/PowerIterationClustering.scala
---
@@ -231,8 +231,12 @@ class PowerIterationClustering private[clustering] (
dataset.schema($(idCol)).dataType match {
case _: LongType =>
uncastPredictions
+ case _: IntegerType =>
+ uncastPredictions.withColumn($(idCol),
col($(idCol)).cast(LongType))
--- End diff --
Shouldn't it be
` case _: IntegerType =>
uncastPredictions.withColumn($(idCol),
col($(idCol)).cast(IntegerType))
`
Otherwise the cast is unnecessary, right? The prediction output already
has id as LongType, while the dataset has id as IntegerType, so we need to
cast prediction.id back to IntegerType.
Correct me if I am wrong.
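To make the point concrete, here is a minimal sketch, assuming the prediction output really does carry a Long id as described above. The column name `id`, the sample rows, the object name, and the local `SparkSession` are illustrative assumptions and are not taken from the PR.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{IntegerType, LongType}

object CastIdBackSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption, not part of the PR).
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("cast-id-back-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for uncastPredictions: id is already a Long.
    val uncastPredictions = Seq((0L, 1), (1L, 0)).toDF("id", "prediction")
    assert(uncastPredictions.schema("id").dataType == LongType)

    // Casting a Long column to LongType leaves the column type unchanged.
    val castToLong = uncastPredictions.withColumn("id", col("id").cast(LongType))
    assert(castToLong.schema("id").dataType == LongType)

    // Casting to IntegerType restores the type of an Integer input id column.
    val castToInt = uncastPredictions.withColumn("id", col("id").cast(IntegerType))
    assert(castToInt.schema("id").dataType == IntegerType)

    spark.stop()
  }
}
```

If the assumption holds, only the second cast changes the schema, which is why the IntegerType branch would need `cast(IntegerType)`.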
---