spark_user created SPARK-24217:
----------------------------------

             Summary: Power Iteration Clustering is not displaying cluster indices corresponding to some nodes.
                 Key: SPARK-24217
                 URL: https://issues.apache.org/jira/browse/SPARK-24217
             Project: Spark
          Issue Type: Bug
          Components: ML
    Affects Versions: 2.4.0
            Reporter: spark_user
             Fix For: 2.4.0


We should display the prediction and id for every node.

As per the definition of PIC given in the code:

PIC takes an affinity matrix between items (or vertices) as input. An affinity
matrix is a symmetric matrix whose entries are non-negative similarities between
items. PIC takes this matrix (or graph) as an adjacency matrix. Specifically,
each input row includes:
 * {{idCol}}: vertex ID
 * {{neighborsCol}}: neighbors of vertex in {{idCol}}
 * {{similaritiesCol}}: non-negative weights (similarities) of edges between
   the vertex in {{idCol}} and each neighbor in {{neighborsCol}}

 * *"PIC returns a cluster assignment for each input vertex."* It appends a new
   column {{predictionCol}} containing the cluster assignment in {{[0,k)}} for
   each row (vertex).
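
The expectation above can be sketched as a small check. This is a minimal
illustration in plain Python, not Spark code: the row layout mirrors the
{{idCol}}, {{neighborsCol}} and {{similaritiesCol}} columns, while the helper
names ({{all_vertices}}, {{missing_predictions}}) and the sample data are
hypothetical. It assumes that "all the nodes" means every vertex ID appearing
in either {{idCol}} or {{neighborsCol}}; the report itself does not say which
nodes are missing.

```python
from typing import Dict, List, Set

# Hypothetical input rows, keyed like the columns described above.
# Vertex 2 appears only as a neighbor, never in the "id" position.
rows = [
    {"id": 0, "neighbors": [1, 2], "similarities": [0.9, 0.1]},
    {"id": 1, "neighbors": [0], "similarities": [0.9]},
    {"id": 3, "neighbors": [2], "similarities": [0.8]},
]


def all_vertices(rows: List[dict]) -> Set[int]:
    """Collect every vertex ID seen as a row id or as a neighbor."""
    vertices: Set[int] = set()
    for row in rows:
        vertices.add(row["id"])
        vertices.update(row["neighbors"])
    return vertices


def missing_predictions(rows: List[dict], predictions: Dict[int, int]) -> List[int]:
    """Return the vertex IDs that have no cluster assignment, sorted."""
    return sorted(all_vertices(rows) - set(predictions))


# Hypothetical output that covers only the vertices listed in the "id" position.
partial = {0: 0, 1: 0, 3: 1}
print(missing_predictions(rows, partial))
```

Under that assumption, an output is complete only when
{{missing_predictions}} returns an empty list.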


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

