Repository: spark
Updated Branches:
  refs/heads/master 1ff41d869 -> aa6db57e3
[SPARK-22399][ML] update the location of reference paper

## What changes were proposed in this pull request?
Update the url of reference paper.

## How was this patch tested?
It is comments, so nothing tested.

Author: bomeng <[email protected]>

Closes #19614 from bomeng/22399.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/aa6db57e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/aa6db57e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/aa6db57e

Branch: refs/heads/master
Commit: aa6db57e39d4931658089d9237dbf2a29acfe5ed
Parents: 1ff41d8
Author: bomeng <[email protected]>
Authored: Tue Oct 31 08:20:23 2017 +0000
Committer: Sean Owen <[email protected]>
Committed: Tue Oct 31 08:20:23 2017 +0000

----------------------------------------------------------------------
 docs/mllib-clustering.md                                       | 2 +-
 .../spark/examples/mllib/PowerIterationClusteringExample.scala | 3 ++-
 .../spark/mllib/clustering/PowerIterationClustering.scala      | 6 +++---
 python/pyspark/mllib/clustering.py                             | 2 +-
 4 files changed, 7 insertions(+), 6 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/aa6db57e/docs/mllib-clustering.md
----------------------------------------------------------------------
diff --git a/docs/mllib-clustering.md b/docs/mllib-clustering.md
index 8990e95..df2be92 100644
--- a/docs/mllib-clustering.md
+++ b/docs/mllib-clustering.md
@@ -134,7 +134,7 @@ Refer to the [`GaussianMixture` Python docs](api/python/pyspark.mllib.html#pyspa
 
 Power iteration clustering (PIC) is a scalable and efficient algorithm for clustering vertices of a
 graph given pairwise similarities as edge properties,
-described in [Lin and Cohen, Power Iteration Clustering](http://www.icml2010.org/papers/387.pdf).
+described in [Lin and Cohen, Power Iteration Clustering](http://www.cs.cmu.edu/~frank/papers/icml2010-pic-final.pdf).
 It computes a pseudo-eigenvector of the normalized affinity matrix of the graph via
 [power iteration](http://en.wikipedia.org/wiki/Power_iteration) and uses it to cluster vertices.
 `spark.mllib` includes an implementation of PIC using GraphX as its backend.

http://git-wip-us.apache.org/repos/asf/spark/blob/aa6db57e/examples/src/main/scala/org/apache/spark/examples/mllib/PowerIterationClusteringExample.scala
----------------------------------------------------------------------
diff --git a/examples/src/main/scala/org/apache/spark/examples/mllib/PowerIterationClusteringExample.scala b/examples/src/main/scala/org/apache/spark/examples/mllib/PowerIterationClusteringExample.scala
index 986496c..6560325 100644
--- a/examples/src/main/scala/org/apache/spark/examples/mllib/PowerIterationClusteringExample.scala
+++ b/examples/src/main/scala/org/apache/spark/examples/mllib/PowerIterationClusteringExample.scala
@@ -28,7 +28,8 @@ import org.apache.spark.mllib.clustering.PowerIterationClustering
 import org.apache.spark.rdd.RDD
 
 /**
- * An example Power Iteration Clustering http://www.icml2010.org/papers/387.pdf app.
+ * An example Power Iteration Clustering app.
+ * http://www.cs.cmu.edu/~frank/papers/icml2010-pic-final.pdf
  * Takes an input of K concentric circles and the number of points in the innermost circle.
  * The output should be K clusters - each cluster containing precisely the points associated
  * with each of the input circles.

http://git-wip-us.apache.org/repos/asf/spark/blob/aa6db57e/mllib/src/main/scala/org/apache/spark/mllib/clustering/PowerIterationClustering.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/clustering/PowerIterationClustering.scala b/mllib/src/main/scala/org/apache/spark/mllib/clustering/PowerIterationClustering.scala
index b2437b8..9444f29 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/clustering/PowerIterationClustering.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/clustering/PowerIterationClustering.scala
@@ -103,9 +103,9 @@ object PowerIterationClusteringModel extends Loader[PowerIterationClusteringMode
 
 /**
  * Power Iteration Clustering (PIC), a scalable graph clustering algorithm developed by
- * <a href="http://www.icml2010.org/papers/387.pdf">Lin and Cohen</a>. From the abstract: PIC finds
- * a very low-dimensional embedding of a dataset using truncated power iteration on a normalized
- * pair-wise similarity matrix of the data.
+ * <a href="http://www.cs.cmu.edu/~frank/papers/icml2010-pic-final.pdf">Lin and Cohen</a>.
+ * From the abstract: PIC finds a very low-dimensional embedding of a dataset using
+ * truncated power iteration on a normalized pair-wise similarity matrix of the data.
  *
  * @param k Number of clusters.
  * @param maxIterations Maximum number of iterations of the PIC algorithm.

http://git-wip-us.apache.org/repos/asf/spark/blob/aa6db57e/python/pyspark/mllib/clustering.py
----------------------------------------------------------------------
diff --git a/python/pyspark/mllib/clustering.py b/python/pyspark/mllib/clustering.py
index 91123ac..bb687a7 100644
--- a/python/pyspark/mllib/clustering.py
+++ b/python/pyspark/mllib/clustering.py
@@ -636,7 +636,7 @@ class PowerIterationClusteringModel(JavaModelWrapper, JavaSaveable, JavaLoader):
 class PowerIterationClustering(object):
     """
     Power Iteration Clustering (PIC), a scalable graph clustering algorithm
-    developed by [[http://www.icml2010.org/papers/387.pdf Lin and Cohen]].
+    developed by [[http://www.cs.cmu.edu/~frank/papers/icml2010-pic-final.pdf Lin and Cohen]].
     From the abstract: PIC finds a very low-dimensional embedding of a
     dataset using truncated power iteration on a normalized pair-wise
     similarity matrix of the data.
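
Note for readers of the comments touched above: the doc text describes PIC as truncated power iteration on a normalized pairwise similarity matrix, followed by clustering on the resulting embedding. The following is a minimal plain-Python sketch of that idea, not Spark's implementation. The function names, the toy affinity matrix, the degree-based starting vector, and the largest-gap split (used here as a simple stand-in for a k-means step) are all assumptions made for illustration.

```python
def power_iteration_embedding(affinity, max_iterations=10):
    """Return a 1-D embedding from truncated power iteration (illustrative)."""
    n = len(affinity)
    degrees = [sum(row) for row in affinity]
    # Row-normalize the affinity matrix: W = D^-1 A.
    w = [[a / degrees[i] for a in affinity[i]] for i in range(n)]
    # Assumed initialization: degree vector scaled to unit L1 norm.
    volume = sum(degrees)
    v = [d / volume for d in degrees]
    # Stop after a fixed number of steps; running to convergence would
    # flatten v into a constant vector and lose the cluster structure.
    for _ in range(max_iterations):
        wv = [sum(w[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(abs(x) for x in wv)
        v = [x / norm for x in wv]
    return v


def split_by_largest_gaps(values, k):
    """Assign k labels by cutting the sorted values at the k-1 largest gaps.

    Hypothetical helper standing in for k-means on the 1-D embedding.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    gaps = [(values[order[p + 1]] - values[order[p]], p)
            for p in range(len(order) - 1)]
    cuts = {p for _, p in sorted(gaps, reverse=True)[:k - 1]}
    labels = [0] * len(values)
    cluster = 0
    for pos, idx in enumerate(order):
        labels[idx] = cluster
        if pos in cuts:
            cluster += 1
    return labels


# Toy graph: two groups (3 and 4 vertices), strong similarity inside a
# group (1.0), weak similarity across groups (0.01), no self-loops.
sizes = [3, 4]
group = [g for g, size in enumerate(sizes) for _ in range(size)]
n = len(group)
affinity = [[0.0 if i == j else (1.0 if group[i] == group[j] else 0.01)
             for j in range(n)] for i in range(n)]

embedding = power_iteration_embedding(affinity, max_iterations=10)
labels = split_by_largest_gaps(embedding, k=2)
print(labels)
```

The two groups are deliberately unequal in size: with equal-sized groups every vertex has the same degree, so the degree-based starting vector would already be constant and carry no cluster signal.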
