[GitHub] spark pull request #21119: [SPARK-19826][ML][PYTHON]add spark.ml Python API ...

viirya Fri, 27 Apr 2018 21:12:36 -0700

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21119#discussion_r184839158
  
    --- Diff: python/pyspark/ml/clustering.py ---
    @@ -1156,6 +1156,204 @@ def getKeepLastCheckpoint(self):
             return self.getOrDefault(self.keepLastCheckpoint)
     
     
    +@inherit_doc
    +class PowerIterationClustering(HasMaxIter, HasPredictionCol, JavaTransformer, JavaParams,
    +                               JavaMLReadable, JavaMLWritable):
    +    """
    +    .. note:: Experimental
    +    Power Iteration Clustering (PIC), a scalable graph clustering algorithm developed by
    +    <a href=http://www.icml2010.org/papers/387.pdf>Lin and Cohen</a>. From the abstract:
    +    PIC finds a very low-dimensional embedding of a dataset using truncated power
    +    iteration on a normalized pair-wise similarity matrix of the data.
    +
    +    PIC takes an affinity matrix between items (or vertices) as input.  An affinity matrix
    +    is a symmetric matrix whose entries are non-negative similarities between items.
    +    PIC takes this matrix (or graph) as an adjacency matrix.  Specifically, each input row
    +    includes:
    +
    +     - :py:class:`idCol`: vertex ID
    +     - :py:class:`neighborsCol`: neighbors of vertex in :py:class:`idCol`
    +     - :py:class:`similaritiesCol`: non-negative weights (similarities) of edges between the
    +        vertex in :py:class:`idCol` and each neighbor in :py:class:`neighborsCol`
    +
    +    PIC returns a cluster assignment for each input vertex.  It appends a new column
    +    :py:class:`predictionCol` containing the cluster assignment in :py:class:`[0,k)` for
    +    each row (vertex).
    +
    +    Notes:
    --- End diff --
    
    Use `.. note::` here instead of a plain `Notes:` heading?
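
    For illustration, a minimal sketch of the suggestion, assuming the notes stay in the class docstring. The class name `Example` and the note text are placeholders, not taken from the PR:

```python
class Example(object):
    """
    Short summary of the class.

    .. note:: Experimental

    .. note:: Placeholder caveat that would otherwise sit under a
        plain-text heading in the docstring.
    """


# The directives are ordinary docstring text; Sphinx is what turns each
# ``.. note::`` into a rendered admonition when the docs are built.
print(Example.__doc__)
```

    Each caveat gets its own directive, and continuation lines are indented under the directive so they stay part of the same note.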


