GitHub user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/20600#discussion_r171625505
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeansModel.scala ---
@@ -155,34 +183,55 @@ object BisectingKMeansModel extends Loader[BisectingKMeansModel] {
       spark.createDataFrame(data).write.parquet(Loader.dataPath(path))
     }

-    private def getNodes(node: ClusteringTreeNode): Array[ClusteringTreeNode] = {
-      if (node.children.isEmpty) {
-        Array(node)
-      } else {
-        node.children.flatMap(getNodes(_)) ++ Array(node)
-      }
-    }
-
-    def load(sc: SparkContext, path: String, rootId: Int): BisectingKMeansModel = {
+    def load(sc: SparkContext, path: String): BisectingKMeansModel = {
--- End diff ---
This changes the signature of `load`, as MiMa notes. I'm not sure you can
do this, though I'm not entirely clear on the semantics of this serialization method.
Can the change be avoided? I'd have thought the older serialization path could be left as-is.
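
For what it's worth, the usual mllib approach is to leave the existing versioned loader (and its `load` signature) alone, add a second versioned loader for the new on-disk layout, and have the public `load(sc, path)` dispatch on the format version read from the metadata. Here is a rough, self-contained sketch of that shape; the `SaveLoadV2_0` name, the `FakeModel` type, the simplified metadata handling, and the `distanceMeasure` field (standing in for whatever the new format adds) are purely illustrative, not this PR's actual code:

```scala
// Rough sketch only: shows dispatching on a persisted format version so that
// the old loader's signature never has to change. All types here are toys.
object VersionedLoadSketch {

  case class FakeModel(rootId: Int, distanceMeasure: String)

  object SaveLoadV1_0 {
    val thisFormatVersion = "1.0"
    // The old entry point keeps exactly the signature it had, so MiMa stays happy.
    def load(path: String, rootId: Int): FakeModel =
      FakeModel(rootId, distanceMeasure = "euclidean") // V1 files carry no such field
  }

  object SaveLoadV2_0 {
    val thisFormatVersion = "2.0"
    // Only the new loader knows about fields that exist only in the new format.
    def load(path: String, rootId: Int): FakeModel =
      FakeModel(rootId, distanceMeasure = "cosine") // pretend this was read from V2 metadata
  }

  // Public entry point: in real code the version would come from the saved
  // metadata; here it is simply passed in.
  def load(path: String, formatVersion: String, rootId: Int): FakeModel =
    formatVersion match {
      case SaveLoadV1_0.thisFormatVersion => SaveLoadV1_0.load(path, rootId)
      case SaveLoadV2_0.thisFormatVersion => SaveLoadV2_0.load(path, rootId)
      case other =>
        throw new IllegalArgumentException(s"Unsupported format version: $other")
    }

  def main(args: Array[String]): Unit = {
    println(load("/tmp/model", formatVersion = "1.0", rootId = 7))
    println(load("/tmp/model", formatVersion = "2.0", rootId = 7))
  }
}
```

The point being: only the public dispatcher needs to learn about the new version; the old loader and its on-disk format can stay exactly as they are.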
---