Hello,
I am starting to evaluate neo4j and I wrote some basic test application. The
results I get are not as fast as I would have wished :) but of course it is
probably because it is not optimized or simply because my case do not fit that
well with graph DB.
If somebody would have few minutes just to tell me if neo4j is a good choice
for
our application, here the story (sorry for the terminology I use, I m new to
graph DB in general)
- Two types of nodes, A and B
- The graph always alternate A and B nodes, path example : A -> B <- A -> B
(never A -> A -> B)
- Maximum depth of a path is 4, like in the previous example
- There are two different types of relationships
- The application needs to find all the paths between a given A node and a
given
B node, where the distance between the two nodes is at maximum, and where the
two first relationships are identical. So for instance I want to find all the
paths between A1 and B5 with the following graph:
A1 --Ra--> B1 <--Ra-- A2 --Ra--> B2
A1 --Ra--> B1 <--Ra-- A2 --Ra--> B5
A1 --Ra--> B2 <--Rb-- A3 --Ra--> B5
A1 --Rb--> B3 <--Rb-- A4 --Rb--> B4
A1 --Rb--> B3 <--Rb-- A4 --Ra--> B5
A1 --Rb--> B3 <--Rb-- A5 --Rb--> B5
So the end result would be
A1 --Ra--> B1 <--Ra-- A2 --Ra--> B5 (Same Ra relationships linking A1 and A2 to
B1)
A1 --Rb--> B3 <--Rb-- A4 --Ra--> B5 (Same Rb relationships linking A1 an A4 to
B3)
A1 --Rb--> B3 <--Rb-- A5 --Rb--> B5 (Same Rb relationships linking A1 an A5 to
B3)
The point is to compute the amount of final relationship types between a node A
and B. So in this case we have 2 Ra types and 1 Rb type between A1 and B5.
Hope this is clear enough :)
I run my test with the following data:
- 1000 A nodes
- 100 B nodes
- Random linking between A and B nodes. Between 20 and 50 relations ships for
each single A node to B nodes.
I made two different implementations:
1) Extend the AllPaths class from the graphalgo lib and modify the filter to
analyze the two first relationships
2) Similar to AllPaths implementation but with the use of an Evaluator to
analyze the two first relationships but also exclude wrong path faster
Once I get the paths I iterate them to compute the total of final Ra and Rb
relationships.
The result for all paths between two specifics nodes A and B returns 2573
different paths
Implementation (1) processes it in about 1150-1250ms
Implementation (2) processes it in about 950~1050ms
I tried to tweak the mapped memory config, but so far this is as good as I
manage to get it. I am using the embedded java version of neo4j.
I believe that the more nodes and relationships I will add the more time it
will
take to compute. So the big questions are:
- Does this scenario makes any sense under neo4j/GraphDB?
- Are the performances that I am getting normal considering the case or is
there
something I am missing or doing wrong?
Thank you for your time.
Regards
Nonak
_______________________________________________
Neo4j mailing list
[email protected]
https://lists.neo4j.org/mailman/listinfo/user