Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/21081#discussion_r181847695
--- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala ---
@@ -312,6 +329,8 @@ class KMeans @Since("1.5.0") (
val handlePersistence = dataset.storageLevel == StorageLevel.NONE
val instances: RDD[OldVector] =
dataset.select(col($(featuresCol))).rdd.map {
case Row(point: Vector) => OldVectors.fromML(point)
+ case Row(point: Seq[_]) =>
+ OldVectors.fromML(Vectors.dense(point.asInstanceOf[Seq[Double]].toArray))
--- End diff --
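I'm not sure this will work with arrays of FloatType: the cast to Seq[Double] is unchecked, so Float elements would only be caught (if at all) when toArray unboxes them. Make sure to test it with a FloatType array column.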
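
To make the concern concrete, here is a minimal sketch in plain Scala (no Spark dependency) of one way the conversion could handle both element types explicitly. The object and method names (SeqToDoubles, toDoubleArray) are hypothetical and not part of the PR; the sketch assumes the Row surfaces the array column as a Seq of boxed Float or Double values.

```scala
// Hypothetical helper, not taken from the PR: converts the elements of an
// array column to Array[Double] without an unchecked cast to Seq[Double].
object SeqToDoubles {
  def toDoubleArray(values: Seq[_]): Array[Double] =
    values.map {
      case d: Double => d            // DoubleType elements pass through
      case f: Float  => f.toDouble   // FloatType elements are widened explicitly
      case other =>
        throw new IllegalArgumentException(s"Unsupported array element: $other")
    }.toArray
}

object SeqToDoublesDemo {
  def main(args: Array[String]): Unit = {
    // Float input is widened element by element.
    println(SeqToDoubles.toDoubleArray(Seq(1.0f, 2.5f)).mkString(", "))

    // For comparison, the unchecked cast from the diff compiles for any Seq
    // because the element type is erased; with Float elements it is expected
    // to fail only once toArray unboxes each element as a Double.
    val viaCast = scala.util.Try(Seq(1.0f, 2.5f).asInstanceOf[Seq[Double]].toArray)
    println(s"unchecked cast succeeded: ${viaCast.isSuccess}")
  }
}
```

In the diff, the new case could then call something like Vectors.dense(toDoubleArray(point)) instead of casting, and a test over a FloatType array column would confirm whether that is needed.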