Re: [Neo4j] Neo4J Batch Inserter is slow with big ids

Michael Hunger Sun, 15 Mar 2015 13:46:24 -0700

I would recommend that you check out the neo4j-import tool that ships with Neo4j 2.2.

Alternatively, what I do is a dual pass.


Create an array of the expected size and add the key entries to it.
Sort the array.
The keys are the entries of the array, and the array index is the node id.
You can scan the array for duplicates and null them out.
Then you can use Arrays.binarySearch() to find your entries.

This is quite efficient and similar to what Neo4j uses internally for 
neo4j-import.
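
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code neo4j-import uses: it assumes the external keys are long values, the class name SortedKeyIndex and its methods are hypothetical, and duplicates are compacted away instead of nulled out, because a primitive long[] cannot hold null and Arrays.binarySearch() needs a fully sorted array.

```java
import java.util.Arrays;

// Hypothetical helper: maps external keys to dense node ids via a sorted array.
final class SortedKeyIndex {
    private final long[] keys; // sorted, duplicate-free external keys

    // First pass: take all collected keys, sort them, and drop duplicates.
    SortedKeyIndex(long[] rawKeys) {
        long[] sorted = rawKeys.clone();
        Arrays.sort(sorted);
        int unique = 0;
        for (int i = 0; i < sorted.length; i++) {
            // Keep a key only if it differs from its predecessor.
            if (i == 0 || sorted[i] != sorted[i - 1]) {
                sorted[unique++] = sorted[i];
            }
        }
        this.keys = Arrays.copyOf(sorted, unique);
    }

    // Second pass: the array index of a key serves as its node id.
    // Returns -1 if the key was never added.
    long nodeId(long externalKey) {
        int index = Arrays.binarySearch(keys, externalKey);
        return index >= 0 ? index : -1;
    }

    int size() {
        return keys.length;
    }
}
```

In the second pass, each relationship's start and end keys would be resolved through nodeId() before the relationship is created. The lookup costs O(log n) per key, and the memory footprint is 8 bytes per distinct key, which is much smaller than a HashMap of boxed values.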

Michael

> On 15.03.2015 at 21:40, Alberto Jesús Rubio Sánchez 
> <[email protected]> wrote:
> 
> Hi Michael,
> 
> Thanks for the reply :)
> 
> I used a map to keep track of the identifiers, but the data files are very 
> large and I ran out of memory. I would have to clear the map every X 
> insertions and then make a second pass to delete duplicates.
> 
> Perhaps the best option is to use the next version. What do you think?
> 
> Thanks again!
> 
> Regards,
> Alberto.
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected] 
> <mailto:[email protected]>.
> For more options, visit https://groups.google.com/d/optout 
> <https://groups.google.com/d/optout>.

