Todd, what do you expect as an answer?
We're currently working in some areas that could benefit from this, but usually nodes with such a high degree add zero information to your results, so you can just skip them and assume an "always hit".

In general, for pattern matching Cypher uses statistics to decide from which side of a (label)-[type]->(label) pair to expand, and then uses runtime information to actually choose the best side if both nodes are bound.

Michael

On Mon, Jan 30, 2017 at 12:15 PM, Todd Leo <[email protected]> wrote:
> Hi Chris,
>
> Unfortunately, in my case I cannot divide relationships/nodes into different
> types/labels. I'm particularly interested in how Cypher uses statistics to
> try to work this out and how that affects the plan. Could you explain in
> detail?
>
> BR,
> Todd Leo
>
> On 30 January 2017 at 7:00 PM +0800, 'Chris Vest' via Neo4j
> <[email protected]> wrote:
>
> Nodes with such a high degree are called super-nodes, and traversals that
> pass through them will likely experience degraded performance. Improving
> the performance in these cases is an active area of graph database
> research. Neo4j mitigates it a little bit by breaking the relationships of
> high-degree nodes into groups by relationship type. Traversal through
> high-cardinality groups is still going to be relatively slow, though, if
> you are only interested in a few specific relationships, and not all of a
> given type. Cypher is aware of this, and will use statistics to try and
> plan around it. I don't know how well it works in practice; it probably
> depends on the query and the structure of the data, as it usually does.
>
> --
> Chris Vest
> System Engineer, Neo Technology
>
> On 30 Jan 2017, at 10.33, [email protected] wrote:
>
> Hi,
>
> I know Neo4j works well on large graphs, under the assumption that nodes
> are generally equally distributed. However, in most cases, graphs in the
> real world follow a scale-free degree distribution.
> My question is: if all relationships share one type and all nodes share
> one label, are there any ways to stay fast when querying through nodes
> with a really high degree, such as 100k neighbor nodes?
>
> P.S. I also posted this on StackOverflow, but no one has answered yet:
> http://stackoverflow.com/q/41832092/758413
>
> ---
> BR,
> Todd Leo
>
> --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "Neo4j" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/neo4j/O1KFfVHlg88/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> [email protected].
> For more options, visit https://groups.google.com/d/optout.
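
To make the point in the reply about picking the best side when both nodes are bound a bit more concrete, here is a minimal Python sketch. It is only an illustration of the general idea, not how Neo4j or the Cypher runtime is implemented: the `Graph` class, the `connected` function and the toy data are all hypothetical. It checks whether two known nodes are connected by scanning the neighbor list of whichever node has the lower degree, so a super-node with 100k neighbors is never scanned when the other end is small.

```python
from collections import defaultdict


class Graph:
    """Toy in-memory undirected graph (hypothetical, not Neo4j's store format)."""

    def __init__(self):
        # Plain lists, so checking for a neighbor is a linear scan.
        self.neighbors = defaultdict(list)

    def add_edge(self, a, b):
        self.neighbors[a].append(b)
        self.neighbors[b].append(a)

    def degree(self, node):
        return len(self.neighbors[node])


def connected(graph, a, b):
    """Check whether a and b are adjacent, scanning from the lower-degree side.

    Returns (found, steps), where steps is the number of neighbors inspected.
    """
    # Both nodes are already known ("bound"), so compare degrees first.
    if graph.degree(a) <= graph.degree(b):
        start, target = a, b
    else:
        start, target = b, a

    steps = 0
    for n in graph.neighbors[start]:
        steps += 1
        if n == target:
            return True, steps
    return False, steps


# A super-node with 100k neighbors, as in the question, plus one extra edge.
g = Graph()
for i in range(100_000):
    g.add_edge("hub", f"n{i}")
g.add_edge("n0", "n1")

found, steps = connected(g, "hub", "n0")
print(found, steps, "of", g.degree("hub"), "hub relationships avoided")
```

The scan is bounded by the smaller of the two degrees, so the cost of the check depends on the ordinary node and not on the super-node. This only helps when both ends are known; a traversal that has to expand outward from the super-node itself still has to visit all of its relationships.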
