Hey folks,
So I have a smallish database with less than 5k nodes and about 22k
relationships. The admin page says the database is 25 MB. I'm trying to
write a query that will give me the list of all outgoing relationships from
a node. My query looks like:
MATCH (seed:Pool{name: 'poolname'}) -[:CALLS|USES*1..1]-> node RETURN COUNT(node)
This query takes 878ms and returns 12, which seems a tad slow, but that's
fine.
To expand out, I ran the following queries (all from the Neo4j web console):
MATCH (seed:Pool{name: 'poolname'}) -[:CALLS|USES*2..2]-> node RETURN COUNT(node)
- takes 1146ms and returns 189
MATCH (seed:Pool{name: 'poolname'}) -[:CALLS|USES*3..3]-> node RETURN COUNT(node)
- takes 4262ms and returns 747
MATCH (seed:Pool{name: 'poolname'}) -[:CALLS|USES*4..4]-> node RETURN COUNT(node)
- takes 17700ms and returns 2816
MATCH (seed:Pool{name: 'poolname'}) -[:CALLS|USES*5..5]-> node RETURN COUNT(node)
- takes 91736ms and returns 12055
MATCH (seed:Pool{name: 'poolname'}) -[:CALLS|USES*6..6]-> node RETURN COUNT(node)
- takes 484238ms and returns 55375
For comparison, MATCH (n) RETURN COUNT(n) returns 4845.
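For what it's worth, here is a quick Python sketch of the arithmetic on the numbers above. It is nothing Neo4j-specific, just the counts and timings I posted. It prints the growth factor from each depth to the next, alongside the average number of relationships per node. I'm taking "about 22k" as 22000, which is an approximation and covers all relationship types, not just CALLS and USES.

```python
# Counts and timings copied from the queries above, keyed by depth.
counts = {1: 12, 2: 189, 3: 747, 4: 2816, 5: 12055, 6: 55375}
times_ms = {1: 878, 2: 1146, 3: 4262, 4: 17700, 5: 91736, 6: 484238}

total_nodes = 4845            # from MATCH (n) RETURN COUNT(n)
total_relationships = 22000   # assumption: "about 22k", all relationship types

# Average relationships per node across the whole graph.
avg_degree = total_relationships / total_nodes


def growth(series):
    """Ratio between each depth's value and the previous depth's value."""
    depths = sorted(series)
    return {d: series[d] / series[d - 1] for d in depths[1:]}


count_growth = growth(counts)
time_growth = growth(times_ms)

print(f"average relationships per node: {avg_degree:.2f}")
for depth in sorted(count_growth):
    print(f"depth {depth}: count x{count_growth[depth]:.2f}, "
          f"time x{time_growth[depth]:.2f}")
```

From depth 3 onward the count grows by somewhere between about 3.8x and 4.6x per hop, which is in the same range as the average relationships per node, and depths 5 and 6 return more results than there are nodes in the database.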
So I have three questions:
1. It seems obvious that while Neo4j is traversing the graph it is running
into loops and counting some nodes twice. Is there any way to stop it from
doing that?
2. Even allowing for the loops, 17 seconds to traverse to 2816 nodes seems
very slow, and my impression was that Neo4j was supposed to do this really
quickly. Am I expecting too much? Is there anything obvious that you can
see that I've done wrong? I've included the specs of the VM that Neo4j is
running on below in case they are helpful.
3. Is there anything obvious that I'm missing that might be used to improve
performance?
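To make question 1 concrete, here is a small Python sketch of what I think is going on. It is a toy model, not Neo4j's implementation: the graph, the node names, and the function count_paths_and_nodes are all made up for illustration. It counts every path of exactly N hops that does not reuse a relationship (my understanding is that this is roughly how Cypher matches a pattern) and compares that with the number of distinct end nodes.

```python
# Toy directed graph (hypothetical data): node -> list of target nodes.
# It contains a cycle (a -> c -> a) and more than one route to c and d.
GRAPH = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a", "d"],
    "d": [],
}


def count_paths_and_nodes(graph, start, hops):
    """Return (number of paths, number of distinct end nodes) for paths of
    exactly `hops` relationships that never reuse a relationship."""
    paths = 0
    end_nodes = set()

    def walk(node, remaining, used):
        nonlocal paths
        if remaining == 0:
            paths += 1
            end_nodes.add(node)
            return
        for target in graph[node]:
            edge = (node, target)
            if edge in used:  # each relationship at most once per path
                continue
            walk(target, remaining - 1, used | {edge})

    walk(start, hops, frozenset())
    return paths, len(end_nodes)


for hops in range(1, 5):
    paths, nodes = count_paths_and_nodes(GRAPH, "a", hops)
    print(f"{hops} hops: {paths} paths, {nodes} distinct end nodes")
```

In the toy graph, 2 hops from "a" gives 4 paths but only 3 distinct end nodes, which is the kind of gap I suspect is inflating my counts.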
Neo4j is running on a VM that has 12 GB of RAM:
XXXXXXXXXXXXX:~$ free -m
             total       used       free     shared    buffers     cached
Mem:         12017       9429       2587          0         67        312
-/+ buffers/cache:       9048       2968
Swap:            0          0          0
The load average while the last query is running is 2.99, 2.68, 2.17.
The VM has 4 CPUs (Intel(R) Core(TM)2 Duo CPU T7700 @ 2.40GHz).
I do have a lot of entries in the log like:
2014-11-06 06:23:35.604+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC
Monitor: Application threads blocked for 1042ms.
but not while running the last query. So far as I know, that one query is
the only query running, so there shouldn't even be heavy load on the
database.
I'm not really a Java guy, but after seeing other threads about performance
I tweaked (or tried to tweak) the heap configuration by setting these values
in neo4j-wrapper.conf:
wrapper.java.initmemory=4096
wrapper.java.maxmemory=8192
Thanks in advance for any help/suggestions!
Ian