It all *depends* on your queries. Neo4j Enterprise has the compiled Cypher runtime, which, depending on the query, can be 3-5 times faster. It also has a new label index implementation that speeds things up further.
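As a quick check, you can see which runtime a plan uses with PROFILE, and pin a statement to the compiled runtime with a `CYPHER` prologue (a sketch against 3.2 Enterprise; the `:Entity` label is the one from the warm-up query below):

```cypher
// Inspect the plan; the PROFILE output header shows which
// runtime (e.g. COMPILED) the planner chose.
PROFILE MATCH (n:Entity) RETURN count(n);

// Explicitly request the compiled runtime (Enterprise only).
// Queries it cannot handle fall back to another runtime.
CYPHER runtime=compiled MATCH (n:Entity) RETURN count(n);
```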
Without query examples / PROFILE output / data model etc. I can't give any predictions.

Michael

On Sat, Jul 1, 2017 at 12:21 PM, unrealadmin23 via Neo4j <[email protected]> wrote:

> Before taking any timings, I try to warm things up with:
>
> neo4j> call apoc.warmup.run();
>
> and
>
> neo4j> match (n:Entity) with n.name as name return count(*);
>
> Michael, how much faster in real terms, then, is Neo4j 3.2.1 over 3.2.0
> Enterprise, and for which operations?
>
> On Saturday, 1 July 2017 01:41:03 UTC+1, Michael Hunger wrote:
>
>> *What does your query look like?*
>>
>> *How do you do this: "the query is threaded to use all cpu."?*
>>
>> If it has to scan the whole dataset, then depending on your memory
>> configuration it first has to load the data into memory, in which case
>> you are measuring the performance of your IO.
>> If the database is larger than memory, it has to discard data and reload
>> it again, which affects performance massively.
>>
>> Did you configure the page cache in your neo4j.conf according to the
>> database size? And set the heap to e.g. 16 or 32G? Larger heaps
>> shouldn't make a difference.
>> *Page cache is what counts most.*
>>
>> Which Neo4j version are you using? I recommend 3.2.1 Enterprise, which
>> comes, for instance, with the compiled Cypher runtime.
>>
>> Michael
>>
>> On Fri, Jun 30, 2017 at 10:10 PM, Patrice Loos <[email protected]> wrote:
>>
>>> I am testing a Java query on datasets of different sizes, 100 million
>>> to 1 billion edges.
>>> The query does not return much data (10 to 20 vertices with
>>> corresponding edges), but it needs to scan the whole dataset.
>>> I can see a big performance degradation when the database size is
>>> bigger than 32 gigs.
>>> I am running the test on a 32-core, 244G RAM virtual server; the query
>>> is threaded to use all CPUs.
>>> I changed the Java heap size to 96G and played with the garbage
>>> collector options (retaining -XX:+UseG1GC as the option that improved
>>> things most) to get a better outcome, but I still get a big dip in
>>> performance; I assume the threshold is around 32G:
>>>
>>> 100M edges, database is 7.5G: 12 min
>>> 250M edges, database is 19G: 35 min
>>> 500M edges, database is 38G: 12 hours with -XX:+UseG1GC
>>> 1B edges, database is 76G: 51 hours without -XX:+UseG1GC
>>>
>>> Furthermore, for the 0.5 billion and 1 billion tests I can see that
>>> the bulk of the operations are system operations, 60% versus 40% user
>>> operations (from the Linux top command). When I run the smaller tests,
>>> 100% of the operations are user operations.
>>>
>>> Are the Java GC improvements in the Enterprise edition of Neo4j
>>> significant enough to bring the performance of the large-scale dataset
>>> query into the same range as the smaller ones?
>>> Is there something else I can do to improve the performance of larger
>>> dataset queries?
>>>
>>> Thanks,
>>> Patrice
>>>

--
You received this message because you are subscribed to the Google Groups "Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
For more options, visit https://groups.google.com/d/optout.
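For a box like the one described in the thread (244G RAM, stores up to ~76G), the memory advice above would translate to something roughly like this in neo4j.conf for Neo4j 3.x (sizes are illustrative assumptions for this machine, not a general recommendation):

```properties
# Size the page cache to hold the whole store, with headroom for growth.
dbms.memory.pagecache.size=120g

# Keep the heap fixed and modest; staying below 32G preserves compressed
# object pointers, and very large heaps mostly add GC pause time.
dbms.memory.heap.initial_size=31g
dbms.memory.heap.max_size=31g
```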
