Patrice,

I started off with a VM configuration similar to yours.  I found a 
considerable speed-up by ditching the VM and running natively on the 
Linux platform.

Wayne


On Friday, 30 June 2017 21:43:12 UTC+1, Patrice Loos wrote:

> I am testing a Java query on datasets of different sizes, 100 million to 
> 1 billion edges. 
> The query does not return much data (10 to 20 vertices with their 
> corresponding edges), but it needs to scan the whole dataset.
> I can see a big performance degradation when the database size exceeds 
> 32 GB.
> I am running the test on a 32-core, 244 GB RAM virtual server; the query 
> is threaded to use all CPUs.
> I changed the Java heap size to 96 GB and played with the garbage 
> collector options (retaining -XX:+UseG1GC as the most effective one) 
> to get a better outcome, but I still see a big dip in performance; the 
> threshold appears to be around 32 GB:
>
> 100M edges, database is 7.5 GB: 12 min
> 250M edges, database is 19 GB: 35 min
> 500M edges, database is 38 GB: 12 hours with -XX:+UseG1GC 
> 1B edges, database is 76 GB: 51 hours without -XX:+UseG1GC 
>
> Furthermore, for the 0.5 billion and 1 billion edge tests I can see that 
> the bulk of the operations are system operations: 60% system versus 40% 
> user (from the Linux top command). When I run the smaller tests, 100% of 
> the operations are user operations.
>
> Are the Java GC improvements in the Enterprise edition of Neo4j 
> significant enough to bring the performance of the large-scale dataset 
> queries into the same range as the smaller ones? 
> Is there anything else I can do to improve the performance of 
> larger-dataset queries?
>
> tks
> Patrice
>
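For anyone reproducing the setup above, the heap and GC flags mentioned in
the post can be combined like this. A minimal sketch: only the 96 GB heap
and -XX:+UseG1GC come from the message; the -Xms setting and the jar name
are placeholders.

```shell
# JVM options from the post: 96 GB heap with the G1 garbage collector.
# -Xms96g (fixed initial heap) is an assumption; "query.jar" is a
# placeholder for the actual query application.
JAVA_OPTS="-Xms96g -Xmx96g -XX:+UseG1GC"
echo java $JAVA_OPTS -jar query.jar
```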

-- 
You received this message because you are subscribed to the Google Groups 
"Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.
