GitHub user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/20578#discussion_r167563685
--- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala ---
@@ -158,18 +159,30 @@ class FPGrowth @Since("2.2.0") (
}
private def genericFit[T: ClassTag](dataset: Dataset[_]): FPGrowthModel = {
+ val handlePersistence = dataset.storageLevel == StorageLevel.NONE
+
val data = dataset.select($(itemsCol))
- val items = data.where(col($(itemsCol)).isNotNull).rdd.map(r => r.getSeq[T](0).toArray)
+ val items = data.where(col($(itemsCol)).isNotNull).rdd.map(r => r.getSeq[Any](0).toArray)
--- End diff --
Yeah, probably a quirk of history. If it works, it's fine.