Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/6737#discussion_r32395914
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -479,6 +500,25 @@ object KMeans {
}
/**
+ * Trains a k-means model using the given set of parameters and initial cluster centers.
+ *
+ * @param data training points stored as `RDD[Vector]`
+ * @param k number of clusters
+ * @param maxIterations max number of iterations
+ * @param initialModel an existing set of cluster centers.
+ */
+ def train(
--- End diff --
I'm not sure what the current thinking is on adding yet another overload to
this utility method. At some point one is expected to use `KMeans` directly,
and I recall a move away from adding these utility methods. But I'm not
sure -- @mengxr @jkbradley any opinions?
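
For context, the alternative to a new static overload is configuring `KMeans` directly via its chainable setters. Here is a minimal, self-contained sketch of that builder style; `KMeansLike`, `KMeansModelLike`, and their trivial `run` are hypothetical stand-ins for illustration, not Spark's actual implementation:

```scala
// Hypothetical stand-in for the mllib KMeans builder API. `run` here simply
// returns the configured initial centers (or zeroed centers) instead of
// performing real clustering.
case class KMeansModelLike(clusterCenters: Array[Double])

class KMeansLike {
  private var k: Int = 2
  private var maxIterations: Int = 20
  private var initialModel: Option[KMeansModelLike] = None

  // Chainable setters, mirroring the `new KMeans().setK(...)` style
  def setK(k: Int): this.type = { this.k = k; this }
  def setMaxIterations(n: Int): this.type = { this.maxIterations = n; this }
  def setInitialModel(m: KMeansModelLike): this.type = {
    this.initialModel = Some(m); this
  }

  def run(data: Seq[Double]): KMeansModelLike =
    initialModel.getOrElse(KMeansModelLike(Array.fill(k)(0.0)))
}

// Usage: one entry point covers every parameter combination, so introducing
// an `initialModel` option needs no additional static `train` overload.
val model = new KMeansLike()
  .setK(3)
  .setMaxIterations(10)
  .setInitialModel(KMeansModelLike(Array(1.0, 2.0, 3.0)))
  .run(Seq(0.5, 1.5, 2.5))
```

The design trade-off under discussion: each optional parameter added to the static `train` helpers multiplies the overload count, whereas the builder form absorbs new options as single setters.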