I focused the same problem. Nodes with a lot of relationships are very
difficult (needs a lot of time) to be traversed. I solved the problem by
grouping the relationships using additional nodes. The dense node then
has only a few relationships to different 'group' nodes. Each 'group'
node then has again many relationships to other nodes.

Although this helps, it is a very ugly solution.

Best regards

Norbert Tausch


Am 29.06.2011 16:07, schrieb Niels Hoogeveen:
> Recently I have worked on loading the content of DbPedia into my database and 
> run into a performance issue.
> My application has a meta-layer; inspired by the meta model component, but 
> rewritten in Scala.
> All DbPedia resources are said to be an instance of "topic", 
> creating a relationship from that resource node to the node that describes 
> the topic class.
> This makes the "topic class" node of course densely populated.
> The "topic class" node has relationships other than "HAS_INSTANCE", 
> for example "SUB_CLASS_OF", which states that the "topic class" node is a 
> subclass of "typable". 
> When trying to retrieve the "SUB_CLASS_OF" relationships of the "topic class" 
> node performance degrades enormously. 
>
> It looks (please correct me if I am wrong in my assumption) as if all 
> relationships are being scanned 
> to filter out the "SUB_CLASS_OF" relationships (of which there are very few, 
> especially compared to the "HAS_INSTANCE" relationship)
> I ended up placing all "HAS_INSTANCE" into the Timeline index from 
> Neo4j-graph-collections for two reasons,it's nice to know when a resource 
> became an instance of a class (bonus), and to make sure that not a single 
> nodebecomes heavily populated.
> So far so good, but delving deeper into the Timeline index, I notice that the 
> relationship between an entry nodeand the root of the tree is partially 
> established by the use of a property on "entry node" which names the timeline 
> index.
> The simplest way to establish the relationship between an "entry node" and 
> the tree root is by means of a Lucene index lookup.
> This is of course not a very fastest solution and actually would mean the 
> same as adding a property to the "resource node", listing the classes a 
> resource is an instance of.
> Adding a relationship from "entry node" to "tree root" in the Timeline 
> component would create yet another densely populated nodein the database (in 
> this case the tree root). 
> Is there a way out of this situation? 
> Would it be possible to partition the relationships in the database per 
> relationship type per direction, so densely populated nodescan get traversed 
> fast for those relationships types that are sparsely populated?
> Niels
>                                         
> _______________________________________________
> Neo4j mailing list
> [email protected]
> https://lists.neo4j.org/mailman/listinfo/user
_______________________________________________
Neo4j mailing list
[email protected]
https://lists.neo4j.org/mailman/listinfo/user

Reply via email to