My problem pattern is exactly the same as Niels's: a dense node has millions of relationships of one direction and type, and only a few (sparse) relationships of a different direction and type. Traversals usually follow only those sparse relationships on the dense nodes.
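To make the pattern concrete, here is a toy sketch in Python. It is not Neo4j code and makes no claim about how the engine actually stores relationships: `FlatNode` assumes every relationship of a node sits in one per-node list that has to be scanned, while `GroupedNode` assumes relationships are bucketed by (type, direction), roughly in the spirit of the "split them into groups" idea Michael mentions in the quoted message. The class names, the `visited` counter and the 200,000 figure are mine, purely for illustration; the relationship type names are taken from Niels's example.

```python
from collections import defaultdict


class FlatNode:
    """Toy node: all relationships live in one list, whatever their type."""

    def __init__(self):
        self.rels = []      # (rel_type, direction, other_node_id)
        self.visited = 0    # relationship records touched by lookups

    def add(self, rel_type, direction, other):
        self.rels.append((rel_type, direction, other))

    def get(self, rel_type, direction):
        # Every record is inspected, including the millions we do not want.
        result = []
        for r_type, r_dir, other in self.rels:
            self.visited += 1
            if r_type == rel_type and r_dir == direction:
                result.append(other)
        return result


class GroupedNode:
    """Toy node: relationships are bucketed by (type, direction)."""

    def __init__(self):
        self.groups = defaultdict(list)
        self.visited = 0

    def add(self, rel_type, direction, other):
        self.groups[(rel_type, direction)].append(other)

    def get(self, rel_type, direction):
        # Only the requested bucket is touched.
        bucket = self.groups.get((rel_type, direction), [])
        self.visited += len(bucket)
        return list(bucket)


def build(node, dense=200_000, sparse=3):
    """Attach many HAS_INSTANCE and a few SUB_CLASS_OF relationships."""
    for i in range(dense):
        node.add("HAS_INSTANCE", "OUT", i)
    for i in range(sparse):
        node.add("SUB_CLASS_OF", "OUT", dense + i)
    return node


flat = build(FlatNode())
grouped = build(GroupedNode())

flat_result = flat.get("SUB_CLASS_OF", "OUT")
grouped_result = grouped.get("SUB_CLASS_OF", "OUT")

print(flat.visited, grouped.visited)  # prints: 200003 3
```

Both lookups return the same three neighbours, but the flat layout touches every relationship on the node to find them. That is the cost I would like to avoid when only the sparse type is ever traversed from the dense node.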
Now, even when traversing only these sparse relationships, Neo4j becomes extremely slow, and the slowdown is clearly non-linear (in the big-O sense). Some tests I ran (email me if you want the code) reveal that even the number of those dense nodes in the database greatly influences the results. I just reported to Michael the runs with the latest M05 snapshot, which are not very positive...

I have suggested (auto-)indexing of the relationship types / directions used by the traversal framework, but I'm no graphdb-engine expert :-(

A'

Message: 5
> Date: Wed, 29 Jun 2011 18:19:10 +0200
> From: Niels Hoogeveen <[email protected]>
> Subject: Re: [Neo4j] traversing densely populated nodes
> To: <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset="iso-8859-1"
>
> > Michael,
> >
> > The issue I am referring to does not pertain to traversing many
> > relationships at once, but to the impact many relationships of one type
> > have on relationships of another type on the same node.
> >
> > Example:
> >
> > A topic class has 2 million outgoing relationships of type "HAS_INSTANCE"
> > and has 3 outgoing relationships of type "SUB_CLASS_OF".
> >
> > Fetching the 3 relationships of type "SUB_CLASS_OF" takes very long,
> > I presume due to the presence of the 2 million other relationships.
> >
> > I have no need to ever fetch the "HAS_INSTANCE" relationships from
> > the topic node. That relationship is always traversed from the other
> > direction.
> >
> > I do want to know the class of a topic instance, leading to the topic
> > class, but have no real interest ever to traverse all topic instances
> > from the topic class (at least not directly... I do want to know the most
> > recent addition, and that's what I use the timeline index for).
> >
> > Niels
> >
> > From: [email protected]
> > Date: Wed, 29 Jun 2011 17:50:08 +0200
> > To: [email protected]
> > Subject: Re: [Neo4j] traversing densely populated nodes
> >
> > I think this is the same problem that Angelos is facing. We are currently
> > evaluating options to improve the performance on those highly connected
> > supernodes.
> >
> > A traditional option is really to split them into groups or even kind of
> > shard their relationships to a second layer.
> >
> > We're looking into storage improvement options as well as modifications
> > to the retrieval of that many relationships at once.
> >
> > Cheers
> >
> > Michael
> _______________________________________________
Neo4j mailing list
[email protected]
https://lists.neo4j.org/mailman/listinfo/user

