Hi there,

Can someone share some good suggestions on how to import/update huge amounts of data in 
Neo4j? I currently need to import 200,000,000+ nodes and 200,000,000+ 
relationships into Neo4j, and I need to keep that data up to date. Due to 
the business requirements, I cannot do this in offline mode, so it seems 
LOAD CSV or Cypher CREATE/MERGE queries are the only choice. However, in a 
PoC project I ran into this performance issue:
<https://github.com/neo4j/neo4j/issues/10395>

On a test server with a 40-core CPU, 30GB of memory, and an SSD disk, write 
performance is poor: I only get ~1000 nodes/s when using CREATE/MERGE 
Cypher queries and ~5000 nodes/s when using LOAD CSV. Another big issue is 
that as soon as I try to update the nodes/relationships, the Neo4j server 
becomes *unavailable*.

Here is the sysinfo from Neo4j:


Here is the heap size and pagecache:

dbms.memory.heap.initial_size=12g
dbms.memory.heap.max_size=12g
dbms.memory.pagecache.size=14g

Here is the sample cypher:

CREATE CONSTRAINT ON (c:Company) ASSERT c.id IS UNIQUE;
CREATE CONSTRAINT ON (p:Person) ASSERT p.id IS UNIQUE;



USING PERIODIC COMMIT 10000
LOAD CSV FROM 'file:///person-{number}.csv' AS row
MERGE (person:Person { id: row[0] })
ON CREATE SET
    ......
ON MATCH SET
    ......

USING PERIODIC COMMIT 10000
LOAD CSV FROM 'file:///company-{number}.csv' AS row
MERGE (company:Company { id: row[0] })
ON CREATE SET
    ......
ON MATCH SET
    ......

USING PERIODIC COMMIT 10000
LOAD CSV FROM 'file:///person-legal-company-{number}.csv' AS row
MATCH (p:Person { id: row[0] })
MATCH (c:Company { id: row[1] })
MERGE (p)-[r:REL]->(c)
ON CREATE SET
    ......
ON MATCH SET
    ......
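For the CREATE/MERGE path mentioned above, this is roughly how I feed the data in from the client side (a minimal Python sketch, not my exact code; the batch size, the `MERGE_PERSONS` query text, and the driver connection details in the comments are just illustrative):

```python
# Sketch of the batched CREATE/MERGE write path: rows from the CSV are
# grouped into chunks, and each chunk is sent as one parameterized UNWIND
# query, so a single transaction covers ~10000 rows instead of one row
# per query. The ON CREATE / ON MATCH SET clauses are elided as above.

from itertools import islice

MERGE_PERSONS = """
UNWIND $rows AS row
MERGE (p:Person { id: row[0] })
"""

def batches(rows, size=10000):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

# With the official Bolt driver this would be used along these lines
# (connection details are placeholders, not executed here):
#
#   from neo4j import GraphDatabase
#   driver = GraphDatabase.driver("bolt://localhost:7687",
#                                 auth=("neo4j", "..."))
#   with driver.session() as session:
#       for chunk in batches(csv_rows):
#           session.run(MERGE_PERSONS, rows=chunk)
```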


Thanks in advance for your help.

Regards.


-- 
You received this message because you are subscribed to the Google Groups 
"Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to neo4j+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
