Hi (or should I already say "Hi Michael Hunger" because you seem to answer
most questions),
We currently maintain a "directed acyclic graph" in SQL databases (the
usual products). For large graphs, traversal is a problem. A typical
question is "Tell me all nodes with a certain property that have a
relationship to a given node". In SQL we do a conventional recursive
depth-first traversal. Already visited nodes are stored in white or black
lists. So typically this involves several thousand SQL statements, and at
the end you also have several thousand nodes in the white or black lists.
The query usually takes anywhere from a few seconds to about 30 seconds.
The SQL server is connected locally or through a LAN.
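The traversal described above can be sketched roughly as follows. This is a
minimal Python sketch under assumptions, not the actual implementation: the
names find_related, neighbours and matches are hypothetical, a dict stands in
for the per-node SQL lookup, the white list is assumed to hold visited nodes
that have the property and the black list visited nodes that do not, and an
explicit stack replaces the recursion.

```python
from typing import Callable, Dict, Iterable, List, Set


def find_related(start: str,
                 neighbours: Callable[[str], Iterable[str]],
                 matches: Callable[[str], bool]) -> Set[str]:
    """Depth-first search from `start`; returns reachable nodes that match."""
    white: Set[str] = set()  # assumption: visited nodes that have the property
    black: Set[str] = set()  # assumption: visited nodes that do not
    stack: List[str] = list(neighbours(start))
    while stack:
        node = stack.pop()
        if node in white or node in black:
            continue  # already visited via another path through the DAG
        (white if matches(node) else black).add(node)
        stack.extend(neighbours(node))  # one lookup per newly visited node
    return white


# Toy data (hypothetical): child -> parents, standing in for the SQL tables.
PARENTS: Dict[str, List[str]] = {
    "mat1": ["a1", "a2"],
    "a1": ["m1"],
    "a2": ["m1", "m2"],
    "m1": [],
    "m2": [],
}
LEVEL: Dict[str, str] = {"a1": "SUB", "a2": "SUB", "m1": "MODEL", "m2": "MODEL"}

lookups = 0  # counts calls that would each be one SQL statement


def neighbours(node: str) -> List[str]:
    global lookups
    lookups += 1
    return PARENTS.get(node, [])


result = find_related("mat1", neighbours, lambda n: LEVEL.get(n) == "MODEL")
print(sorted(result), lookups)
```

Each call to neighbours corresponds to one SQL statement in this sketch,
which is why the statement count grows with the number of visited nodes.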
We would like to find out whether a graph database like Neo4j could be an
improvement here. So we imported one of our bigger databases into
Neo4j Enterprise 2.1.5 (evaluation). Here are the numbers: 3.2 M nodes (2
types), 24 M properties, 22 M relationships (1 type, unidirectional), disk
usage 9.6 GB.
To our disappointment, the query times are comparable to our SQL results.
Sometimes they are 100% better, but nowhere near an order of magnitude,
which is what we had hoped for given what the Neo4j manual promises
("Querying is performed through traversals, which can perform millions of
traversal steps per second").
This is the query we use:
MATCH (assy:Assy {k_ebene: 'MODEL'})-[:HAS*..]->(mat:Mat {m_matnr: "A4420380071"})
RETURN DISTINCT assy.k_vari ORDER BY assy.k_vari LIMIT 1000
Indexes were created up front for the Assy and Mat nodes.
So far we use a default setup on a fast Windows 8 developer machine with a
lot of memory.
Our question is, do you see any potential for a performance boost for us
here?
Thanks in advance!
Michael (yes also Michael)