[jira] [Commented] (SPARK-26207) add PowerIterationClustering (PIC) doc in 2.4 branch

ASF GitHub Bot (JIRA) Mon, 10 Dec 2018 16:43:59 -0800


    [ https://issues.apache.org/jira/browse/SPARK-26207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16715854#comment-16715854 ]


ASF GitHub Bot commented on SPARK-26207:
----------------------------------------

srowen closed pull request #23168: [SPARK-26207][doc]add PowerIterationClustering (PIC) doc in 2.4 branch
URL: https://github.com/apache/spark/pull/23168
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/docs/ml-clustering.md b/docs/ml-clustering.md
index 1186fb73d0faf..d345512d2b8e8 100644
--- a/docs/ml-clustering.md
+++ b/docs/ml-clustering.md
@@ -265,3 +265,38 @@ Refer to the [R API docs](api/R/spark.gaussianMixture.html) for more details.
 </div>
 
 </div>
+
+## Power Iteration Clustering (PIC)
+
+Power Iteration Clustering (PIC) is a scalable graph clustering algorithm
+developed by <a href=http://www.cs.cmu.edu/~frank/papers/icml2010-pic-final.pdf>Lin and Cohen</a>.
+From the abstract: PIC finds a very low-dimensional embedding of a dataset
+using truncated power iteration on a normalized pair-wise similarity matrix of the data.
+
+`spark.ml`'s PowerIterationClustering implementation takes the following parameters:
+
+* `k`: the number of clusters to create
+* `initMode`: the initialization algorithm
+* `maxIter`: the maximum number of iterations
+* `srcCol`: the name of the input column for source vertex IDs
+* `dstCol`: the name of the input column for destination vertex IDs
+* `weightCol`: the name of the weight column
+
+**Examples**
+
+<div class="codetabs">
+
+<div data-lang="scala" markdown="1">
+Refer to the [Scala API docs](api/scala/index.html#org.apache.spark.ml.clustering.PowerIterationClustering) for more details.
+
+{% include_example scala/org/apache/spark/examples/ml/PowerIterationClusteringExample.scala %}
+</div>
+
+<div data-lang="java" markdown="1">
+Refer to the [Java API docs](api/java/org/apache/spark/ml/clustering/PowerIterationClustering.html) for more details.
+
+{% include_example java/org/apache/spark/examples/ml/JavaPowerIterationClusteringExample.java %}
+</div>
+
+</div>
+
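
For readers who want to see what "truncated power iteration on a normalized
pair-wise similarity matrix" amounts to, here is a minimal single-machine
sketch in plain Java. It is not part of the PR and is not Spark's
implementation: the class PicSketch, its methods, and the toy graph are all
hypothetical. It assumes a symmetric affinity matrix in which every vertex has
positive degree, always uses a degree-based starting vector, and replaces the
final clustering step with a simple 1-D k-means. The (source, destination,
weight) triples in demoGraph() play the role of the srcCol, dstCol, and
weightCol inputs listed in the diff.

```java
import java.util.Arrays;

/** Toy sketch of Power Iteration Clustering; hypothetical, not Spark's code. */
final class PicSketch {

    /** Runs maxIter steps of v <- W v / ||W v||_1, with W = D^-1 A (row-normalized). */
    static double[] embed(double[][] affinity, int maxIter) {
        int n = affinity.length;
        double[] degree = new double[n];
        double volume = 0.0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) degree[i] += affinity[i][j];
            volume += degree[i]; // assumption: every degree is positive
        }
        // Degree-based start: v0[i] = degree[i] / total degree.
        double[] v = new double[n];
        for (int i = 0; i < n; i++) v[i] = degree[i] / volume;

        for (int iter = 0; iter < maxIter; iter++) {
            double[] next = new double[n];
            double norm = 0.0;
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) next[i] += affinity[i][j] * v[j] / degree[i];
                norm += Math.abs(next[i]);
            }
            for (int i = 0; i < n; i++) next[i] /= norm; // L1 normalization
            v = next;
        }
        return v; // one-dimensional embedding, one value per vertex
    }

    /** Lloyd's k-means on the 1-D embedding; centers start evenly spaced in [min, max]. */
    static int[] kMeans1d(double[] x, int k, int rounds) {
        double min = Arrays.stream(x).min().getAsDouble();
        double max = Arrays.stream(x).max().getAsDouble();
        double[] centers = new double[k];
        for (int c = 0; c < k; c++) {
            centers[c] = (k == 1) ? min : min + (max - min) * c / (k - 1);
        }
        int[] label = new int[x.length];
        for (int r = 0; r < rounds; r++) {
            double[] sum = new double[k];
            int[] count = new int[k];
            for (int i = 0; i < x.length; i++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (Math.abs(x[i] - centers[c]) < Math.abs(x[i] - centers[best])) best = c;
                }
                label[i] = best;
                sum[best] += x[i];
                count[best]++;
            }
            for (int c = 0; c < k; c++) {
                if (count[c] > 0) centers[c] = sum[c] / count[c];
            }
        }
        return label;
    }

    /** Embeds the graph, then clusters the embedding into k groups. */
    static int[] cluster(double[][] affinity, int k, int maxIter) {
        return kMeans1d(embed(affinity, maxIter), k, 20);
    }

    /** Two triangles (weights 1.0 and 2.0) joined by one weak edge of weight 0.1. */
    static double[][] demoGraph() {
        double[][] a = new double[6][6];
        double[][] edges = { // {source, destination, weight}
            {0, 1, 1.0}, {0, 2, 1.0}, {1, 2, 1.0},
            {3, 4, 2.0}, {3, 5, 2.0}, {4, 5, 2.0},
            {2, 3, 0.1}
        };
        for (double[] e : edges) {
            a[(int) e[0]][(int) e[1]] = e[2];
            a[(int) e[1]][(int) e[0]] = e[2]; // keep the matrix symmetric
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(cluster(demoGraph(), 2, 10)));
    }
}
```

The iteration is stopped early on purpose. Run to convergence, the vector
becomes constant and carries no cluster information, whereas after a few steps
the values within each tightly connected group are already close to each other
while the groups themselves are still apart. On the toy graph, the two
triangles should therefore receive different labels.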


 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> add PowerIterationClustering (PIC) doc in 2.4 branch
> ----------------------------------------------------
>
>                 Key: SPARK-26207
>                 URL: https://issues.apache.org/jira/browse/SPARK-26207
>             Project: Spark
>          Issue Type: Documentation
>          Components: Documentation, ML
>    Affects Versions: 2.4.0
>            Reporter: Huaxin Gao
>            Assignee: Huaxin Gao
>            Priority: Minor
>             Fix For: 2.4.1
>
>
> add PIC documentation in docs/ml-clustering.md



