It all *depends* on your queries. Neo4j Enterprise has the compiled Cypher runtime, which, depending on the query, can be 3-5 times faster. It also has a new label index implementation that speeds things up further.
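As a quick check, you can see which runtime a plan uses with PROFILE, and pin a statement to the compiled runtime with a `CYPHER` prologue (a sketch against 3.2 Enterprise; the `:Entity` label is the one from the warm-up query below):

```cypher
// Inspect the plan; the PROFILE output header shows which
// runtime (e.g. COMPILED) the planner chose.
PROFILE MATCH (n:Entity) RETURN count(n);

// Explicitly request the compiled runtime (Enterprise only).
// Queries it cannot handle fall back to another runtime.
CYPHER runtime=compiled MATCH (n:Entity) RETURN count(n);
```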
Without query examples / PROFILE output / data model etc. I can't give any predictions.

Michael

On Sat, Jul 1, 2017 at 12:21 PM, unrealadmin23 via Neo4j <[email protected]> wrote:

> Before taking any timings, I try to warm things up with:
>
> neo4j> call apoc.warmup.run();
>
> and
>
> neo4j> match (n:Entity) with n.name as name return count(*);
>
> Michael, how much faster in real terms, then, is Neo4j 3.2.1 over 3.2.0
> Enterprise, and for which operations?
>
> On Saturday, 1 July 2017 01:41:03 UTC+1, Michael Hunger wrote:
>
>> *What does your query look like?*
>>
>> *How do you do this: "the query is threaded to use all cpu."?*
>>
>> If it has to scan the whole dataset, then depending on your memory
>> configuration it first has to load the data into memory, in which case
>> you are measuring the performance of your IO.
>> If the database is larger than memory, it has to discard data and reload
>> it again, which affects performance massively.
>>
>> Did you configure the page cache in your neo4j.conf according to the
>> database size? And set the heap to e.g. 16 or 32G? Larger heaps
>> shouldn't make a difference.
>> *Page cache is what counts most.*
>>
>> Which Neo4j version are you using? I recommend 3.2.1 Enterprise, which
>> comes, for instance, with the compiled Cypher runtime.
>>
>> Michael
>>
>> On Fri, Jun 30, 2017 at 10:10 PM, Patrice Loos <[email protected]> wrote:
>>
>>> I am testing a Java query on datasets of different sizes, 100 million
>>> to 1 billion edges.
>>> The query does not return much data (10 to 20 vertices with
>>> corresponding edges), but it needs to scan the whole dataset.
>>> I can see a big performance degradation when the database size is
>>> bigger than 32 gigs.
>>> I am running the test on a 32-core, 244G RAM virtual server; the query
>>> is threaded to use all CPUs.
>>> I changed the Java heap size to 96G and played with the garbage
>>> collector options (retaining -XX:+UseG1GC as the option that improved
>>> things most) to get a better outcome, but I still get a big dip in
>>> performance; I assume the threshold is around 32G:
>>>
>>> 100M edges, database is 7.5G: 12 min
>>> 250M edges, database is 19G: 35 min
>>> 500M edges, database is 38G: 12 hours with -XX:+UseG1GC
>>> 1B edges, database is 76G: 51 hours without -XX:+UseG1GC
>>>
>>> Furthermore, for the 0.5 billion and 1 billion tests I can see that
>>> the bulk of the operations are system operations, 60% versus 40% user
>>> operations (from the Linux top command). When I run the smaller tests,
>>> 100% of the operations are user operations.
>>>
>>> Are the Java GC improvements in the Enterprise edition of Neo4j
>>> significant enough to bring the performance of the large-scale dataset
>>> query into the same range as the smaller ones?
>>> Is there something else I can do to improve the performance of larger
>>> dataset queries?
>>>
>>> Thanks,
>>> Patrice
>>>

--
You received this message because you are subscribed to the Google Groups "Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
For more options, visit https://groups.google.com/d/optout.
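For a box like the one described in the thread (244G RAM, stores up to ~76G), the memory advice above would translate to something roughly like this in neo4j.conf for Neo4j 3.x (sizes are illustrative assumptions for this machine, not a general recommendation):

```properties
# Size the page cache to hold the whole store, with headroom for growth.
dbms.memory.pagecache.size=120g

# Keep the heap fixed and modest; staying below 32G preserves compressed
# object pointers, and very large heaps mostly add GC pause time.
dbms.memory.heap.initial_size=31g
dbms.memory.heap.max_size=31g
```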
