Can you share your queries and a query plan? Also do you have constraints for the merge properties?
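To illustrate what constraints on the merge properties and a batched query could look like, here is a minimal sketch. The labels and property names (`Abstract`, `article_id`, `Entity`, `name`, `MENTIONS`) and the helper `batches` are hypothetical, since the thread does not show the actual schema or queries. It also folds nodes and relationships into one statement, which is not how the original load is described. The `{rows}` parameter syntax matches Neo4j 2.x/3.x; newer versions write `$rows`.

```python
from itertools import islice

# Hypothetical schema: label and property names are assumptions, not taken
# from the thread. A uniqueness constraint also creates an index, so MERGE
# can look the node up instead of scanning every node with that label.
CONSTRAINTS = [
    "CREATE CONSTRAINT ON (a:Abstract) ASSERT a.article_id IS UNIQUE",
    "CREATE CONSTRAINT ON (e:Entity) ASSERT e.name IS UNIQUE",
]

# One parameterized statement per batch: the whole batch travels as a single
# list parameter and is expanded server-side with UNWIND.
MERGE_BATCH = """
UNWIND {rows} AS row
MERGE (a:Abstract {article_id: row.article_id})
  SET a.text = row.text
WITH a, row
UNWIND row.entities AS name
MERGE (e:Entity {name: name})
MERGE (a)-[:MENTIONS]->(e)
"""


def batches(rows, size=1000):
    """Yield parameter dicts of at most `size` rows each for MERGE_BATCH."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield {"rows": chunk}
```

Each yielded dict would be passed, together with `MERGE_BATCH`, to whichever driver is in use, one transaction per batch. Prefixing the statement with `PROFILE` in the Neo4j browser or shell should return the query plan asked for above.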
Are any of the nodes you insert or connect to heavily contended?

Michael

Sent from my iPhone

> On 19.04.2016 at 15:23, [email protected] wrote:
>
> We're loading 20 million abstracts into Neo4J. The rate has been about 2.5
> million abstracts per week. For comparison, we can load all 20 million
> abstracts into Solr Cloud in less than 24 hours. The abstracts average about
> 400-500 words each. For each abstract, we have 5 additional entity nodes
> with a relationship between the abstract and these entities. We're looking
> for any advice on speeding up the load times for Neo4J.
>
> In our attempts to get better performance when ingesting the abstracts, we
> have tried combinations of py2neo version 2 and 3, and Neo4J version 2 and 3
> Enterprise. Our platform is a 2-processor, 12-core Linux server with 32GB of
> memory. We use the default Neo4J configuration. We prefer merge() to ensure
> only one node per unique article ID but have tried create(). We utilize
> batches of 1000 articles. We minimize round trips to the server with
> transactions, first the entities and then the relationships. No find() or
> find_one() calls are necessary. The script itself runs quickly and then
> lingers during the commit, suggesting the slowdown is coming from the Neo4J
> server. During our trials, we discovered and reported that py2neo 3 hangs
> indefinitely 99% of the time for merge() with the Bolt transaction. It also
> hangs for the HTTP transactions, but it's rare.
>
> Once we get a reasonable single-threaded ingestion rate, we can consider
> running the load in parallel, but since Neo4J is single threaded when updating
> (correct?), we're not sure that will help much.
>
> Eventually we will be loading from a variety of sources in parallel, so we
> must avoid solutions that wipe the Neo4J database first.
>
> Has anyone else experienced such slow load times? Are there some best
> practices we've overlooked (other than writing directly in Java) that might
> help increase load performance?
>
> --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
