felixcheung commented on a change in pull request #23072:
[SPARK-19827][R]spark.ml R API for PIC
URL: https://github.com/apache/spark/pull/23072#discussion_r240492887
##########
File path: R/pkg/R/mllib_clustering.R
##########
@@ -610,3 +616,59 @@ setMethod("write.ml", signature(object = "LDAModel", path = "character"),
function(object, path, overwrite = FALSE) {
write_internal(object, path, overwrite)
})
+
+#' PowerIterationClustering
+#'
+#' A scalable graph clustering algorithm. Users can call \code{spark.assignClusters} to
+#' return a cluster assignment for each input vertex.
+#'
+# Run the PIC algorithm and returns a cluster assignment for each input vertex.
+#' @param data a SparkDataFrame.
+#' @param k the number of clusters to create.
+#' @param initMode the initialization algorithm.
+#' @param maxIter the maximum number of iterations.
+#' @param sourceCol the name of the input column for source vertex IDs.
+#' @param destinationCol the name of the input column for destination vertex IDs
+#' @param weightCol weight column name. If this is not set or \code{NULL},
+#' we treat all instance weights as 1.0.
+#' @param ... additional argument(s) passed to the method.
+#' @return A dataset that contains columns of vertex id and the corresponding cluster for the id.
+#' The schema of it will be:
+#' \code{id: Long}
+#' \code{cluster: Int}
+#' @rdname spark.powerIterationClustering
+#' @aliases assignClusters,PowerIterationClustering-method,SparkDataFrame-method
+#' @examples
+#' \dontrun{
+#' df <- createDataFrame(list(list(0L, 1L, 1.0), list(0L, 2L, 1.0),
+#' list(1L, 2L, 1.0), list(3L, 4L, 1.0),
+#' list(4L, 0L, 0.1)),
+#' schema = c("src", "dst", "weight"))
+#' clusters <- spark.assignClusters(df, initMode="degree", weightCol="weight")
Review comment:
Style nit: please put spaces around `=` in named arguments, i.e. `initMode = "degree", weightCol = "weight"`.