Hey folks,

So I have a smallish database with fewer than 5k nodes and about 22k 
relationships.  The admin page says the database is 25 MB.  I'm trying to 
write a query that will give me the list of all outgoing relationships from 
a node.  My query looks like:

MATCH (seed:Pool{name: 'poolname'}) -[:CALLS|USES*1..1]-> node RETURN COUNT(node)

This query takes 878ms and returns 12, which seems a tad slow, but that's 
fine.
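
One thing I have not ruled out is how the seed node itself is found. A minimal sketch of what I could try, assuming a Neo4j 2.x server (which the 2014 log timestamps suggest) and assuming there is no index on the name property of :Pool yet. Newer Neo4j versions use a different CREATE INDEX syntax, so this is not meant as the definitive form:

```cypher
// Assumption: Neo4j 2.x schema index syntax, and no existing index on :Pool(name).
// Run this as its own statement first.
CREATE INDEX ON :Pool(name)

// Then re-run the one-hop query as a separate statement and compare the timing.
MATCH (seed:Pool {name: 'poolname'})-[:CALLS|USES*1..1]->(node)
RETURN COUNT(node)
```

With fewer than 5k nodes I would not expect this to explain most of the 878ms, so it is a cheap check rather than a fix.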

To expand outward, I ran the following queries (all from the Neo4j web console):
MATCH (seed:Pool{name: 'poolname'}) -[:CALLS|USES*2..2]-> node RETURN COUNT(node)
    takes 1146ms and returns 189
MATCH (seed:Pool{name: 'poolname'}) -[:CALLS|USES*3..3]-> node RETURN COUNT(node)
    takes 4262ms and returns 747
MATCH (seed:Pool{name: 'poolname'}) -[:CALLS|USES*4..4]-> node RETURN COUNT(node)
    takes 17700ms and returns 2816
MATCH (seed:Pool{name: 'poolname'}) -[:CALLS|USES*5..5]-> node RETURN COUNT(node)
    takes 91736ms and returns 12055
MATCH (seed:Pool{name: 'poolname'}) -[:CALLS|USES*6..6]-> node RETURN COUNT(node)
    takes 484238ms and returns 55375

For reference, the total node count:
MATCH (n) RETURN COUNT(n)
    returns 4845

So I have three questions:
1. It seems obvious that while traversing the graph, Neo4j is running into 
loops and counting some nodes more than once.  Is there any way to stop it 
from doing that?
2. Even with the looping, 17 seconds to traverse 2816 nodes seems very slow, 
and my impression was that Neo4j was supposed to do this really quickly.  
Am I expecting too much?  Is there anything obvious that I've done wrong?  
I've included specs below of the VM that Neo4j is running on in case they 
are helpful.
3. Is there anything obvious that I'm missing that might be used to improve 
performance?
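
On question 1: the total node count (4845) is smaller than the 6-hop result (55375), so the counts above cannot be distinct nodes. A variable-length pattern produces one row per matching path, so a node reached by several paths is counted several times. A sketch of the variant I could compare against, assuming what I actually want is the number of distinct nodes reachable within a given number of hops (the 1..4 range is only an example):

```cypher
// Count each reachable node once, however many paths lead to it.
MATCH (seed:Pool {name: 'poolname'})-[:CALLS|USES*1..4]->(node)
RETURN COUNT(DISTINCT node)
```

This only changes what is counted. As far as I understand, the paths are still enumerated before DISTINCT is applied, so it may not be any faster.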

Neo4j is running on a VM that has 12 GB of RAM:
XXXXXXXXXXXXX:~$ free -m
             total       used       free     shared    buffers     cached
Mem:         12017       9429       2587          0         67        312
-/+ buffers/cache:       9048       2968
Swap:            0          0          0

The load average while the last query is running is 2.99, 2.68, 2.17.
The VM has 4 CPUs (Intel(R) Core(TM)2 Duo CPU T7700 @ 2.40GHz).
I do have a lot of entries in the log like:

2014-11-06 06:23:35.604+0000 WARN  [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for 1042ms.

but not while running the last query.  So far as I know, that one query is 
the only query running, so there shouldn't even be heavy load on the 
database.

I'm not really a Java guy, but after seeing other threads about performance 
I tweaked (or tried to tweak) the heap configuration by setting these values 
in neo4j-wrapper.conf:

wrapper.java.initmemory=4096
wrapper.java.maxmemory=8192

Thanks in advance for any help/suggestions!

Ian

