Hi, I am importing data into Neo4j using the batch importer CLI. The data set has about a billion nodes and roughly 2B relationships. The nodes seem to have loaded in about an hour (if I am reading the output correctly), but the import is now preparing the node index and has been doing so for more than 12 hours.

Nodes
[>:23.39 MB/s---|PROPERTIE|NODE:|LAB|*v:37.18 MB/s---------------------------------------------]  1B
Done in 1h 7m 18s 54ms
Prepare node index
[*SORT:11.52 GB--------------------------------------------------------------------------------]881M

Any idea why it is so slow? I don't think it is meant to be?

I can think of a few things to speed it up (which I am trying in another instance) but want to get feedback from folks:

1. Split up the creation of the nodes and relationships into 2 separate commands
2. Create indexes after the creation of the nodes
3. Do match/merge to get rid of duplicates
4. Run the import for relationships

Do you think this will help? If not, what am I doing wrong?

Thanks,
Neha

PS: My heap size is large(ish), around 25G. Is that maybe an issue?
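
For reference, here is a minimal sketch of one alternative to step 3: dropping duplicate node IDs from the node CSV before the import, instead of doing match/merge afterwards. This is a different technique from the one in the list, not something the importer does for you. The function names and the assumption that the ID sits in the first column of a file with a header row are hypothetical. The set of seen IDs is held in memory, so at a billion nodes this would probably need an external sort or a sharded pass instead.

```python
import csv
import io


def dedupe_rows(rows, id_column=0):
    """Yield rows whose ID has not been seen yet (first occurrence wins)."""
    seen = set()  # assumption: all distinct IDs fit in memory
    for row in rows:
        if not row:
            continue  # skip blank lines
        key = row[id_column]
        if key in seen:
            continue
        seen.add(key)
        yield row


def dedupe_csv(src, dst, id_column=0):
    """Copy the header and the de-duplicated rows from src to dst.

    src and dst are open text file objects. Returns the number of data
    rows written (header not counted).
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader, None)
    if header is None:
        return 0  # empty input
    writer.writerow(header)
    written = 0
    for row in dedupe_rows(reader, id_column):
        writer.writerow(row)
        written += 1
    return written


if __name__ == "__main__":
    # Tiny illustrative input; the column names are made up.
    sample = io.StringIO("id:ID,name\n1,a\n2,b\n1,c\n")
    out = io.StringIO()
    dedupe_csv(sample, out)
    print(out.getvalue(), end="")
```

If the node file were already clean, steps 1, 2 and 4 could run without a separate match/merge pass, though whether that helps with the slow index preparation is exactly what I am unsure about.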
