Hi Paul
> My questions for the group are:
>
> 1)       Are there any suggestions for how subgraphs might be algorithmically 
> determined?  Are there "clustering" algorithms that might be leveraged?
>   
What you describe brings to my mind the field of "community detection" 
techniques, i.e. methods that identify groups of nodes on a graph that 
are more densely connected to each other than to the rest of the 
network. There is a real abundance of methods for solving this problem, 
each with its strengths and weaknesses. Perhaps the most widely known 
methods are the ones based on the concept of modularity (introduced by 
prof. M.E.J. Newman). Those methods, however, tend to produce 
communities that are highly skewed in size that is you get few gigantic 
communities and numerous smaller ones. There are also hierarchical 
methods that result in a community dendrogram and enable you to cut in a 
way such that communities have more balanced sizes. My personal 
favorites are variants of the SCAN method (by Xu et al., KDD 2007) which 
do not assign all graph nodes to communities and are scalable to graphs 
of the sizes you mention (million+ nodes, tens of millions of edges).

Best regards,
Symeon



_______________________________________________
Neo4j mailing list
[email protected]
https://lists.neo4j.org/mailman/listinfo/user

Reply via email to