[ 
https://issues.apache.org/jira/browse/SPARK-24217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

spark_user updated SPARK-24217:
-------------------------------
    Description: 
We should display the prediction and ID for every node.

As per the definition of PIC clustering given in the code:

PIC takes an affinity matrix between items (or vertices) as input. An affinity
matrix is a symmetric matrix whose entries are non-negative similarities between
items. PIC takes this matrix (or graph) as an adjacency matrix. Specifically,
each input row includes:
 * {{idCol}}: vertex ID
 * {{neighborsCol}}: neighbors of vertex in {{idCol}}
 * {{similaritiesCol}}: non-negative weights (similarities) of edges between
   the vertex in {{idCol}} and each neighbor in {{neighborsCol}}

 * *"PIC returns a cluster assignment for each input vertex."* It appends a new
   column {{predictionCol}} containing the cluster assignment in {{[0,k)}} for
   each row (vertex).

Currently, PIC does not return cluster indices for neighbor IDs that do not
appear in the ID column.
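
To make the gap concrete, here is a minimal plain-Python sketch (not Spark code, and not the PIC implementation). It assumes each input row is a tuple in the (idCol, neighborsCol, similaritiesCol) layout described above. The sample rows and the helper names {{vertices_missing_from_id_column}} and {{all_vertices}} are hypothetical and exist only for illustration. The sketch lists the vertices that occur only as neighbors, which are the ones this issue says receive no cluster index.

```python
# Hypothetical sample rows in the (idCol, neighborsCol, similaritiesCol) layout.
# These values are made up for illustration; they do not come from the issue.
rows = [
    (0, [1, 2], [0.9, 0.8]),
    (1, [2], [0.7]),
    (3, [4], [0.6]),
]


def vertices_missing_from_id_column(rows):
    """Return vertex IDs that occur as a neighbor but never in the id column."""
    ids = {vertex_id for vertex_id, _, _ in rows}
    neighbors = {n for _, neighbor_ids, _ in rows for n in neighbor_ids}
    return sorted(neighbors - ids)


def all_vertices(rows):
    """Return every vertex ID in the graph, whether it is an id or a neighbor."""
    ids = {vertex_id for vertex_id, _, _ in rows}
    neighbors = {n for _, neighbor_ids, _ in rows for n in neighbor_ids}
    return sorted(ids | neighbors)


if __name__ == "__main__":
    # Vertices that appear only as neighbors (no row of their own).
    print(vertices_missing_from_id_column(rows))
    # The full vertex set that a per-vertex result would need to cover.
    print(all_vertices(rows))
```

In this sample, vertices 2 and 4 appear only in the neighbors column, so a result keyed on the ID column alone would cover 0, 1 and 3 but not 2 and 4.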

  was:
We should display prediction and id corresponding to all the nodes.

As per the definition of PIC clustering, given in the code,

PIC takes an affinity matrix between items (or vertices) as input. An affinity
matrix is a symmetric matrix whose entries are non-negative similarities between
items. PIC takes this matrix (or graph) as an adjacency matrix. Specifically,
each input row includes:
 * {{idCol}}: vertex ID
 * {{neighborsCol}}: neighbors of vertex in {{idCol}}
 * {{similaritiesCol}}: non-negative weights (similarities) of edges between
   the vertex in {{idCol}} and each neighbor in {{neighborsCol}}

 * *"PIC returns a cluster assignment for each input vertex."* It appends a new
   column {{predictionCol}} containing the cluster assignment in {{[0,k)}} for
   each row (vertex).

 


> Power Iteration Clustering is not displaying cluster indices corresponding to 
> some vertices.
> --------------------------------------------------------------------------------------------
>
>                 Key: SPARK-24217
>                 URL: https://issues.apache.org/jira/browse/SPARK-24217
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 2.4.0
>            Reporter: spark_user
>            Priority: Major
>             Fix For: 2.4.0
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
