That's what I said. Use an effective cache (i.e., one of the primitive collection libraries with a long -> long map).
Most memory-efficient and performant way: alternatively, do a dual pass.

- Create a long[] of the expected size and add the key entries to it.
- Sort the array. The keys are the entries of the array, and the array index becomes the node-id.
- Scan the array for duplicates and null them out.
- Then use Arrays.binarySearch() to find your entries.

This is quite efficient and similar to what Neo4j uses internally for neo4j-import.

Michael

> On 29.03.2015 at 18:50, Alberto Jesús Rubio Sánchez
> <[email protected]> wrote:
>
> Hi Michael,
>
> I've been testing, and my problem is that the file is very large, so
> memory fills up.
>
> For this reason I thought of using a cache to store the ids. If a node id
> isn't in the cache, the node is inserted, even if the node is already in
> the database. Finally, I look for the remaining duplicate nodes and merge
> them.
>
> I think it may be a good solution. What do you think?
>
> Thanks,
> Alberto.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
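The dual-pass idea above can be sketched roughly as follows. This is an illustrative sketch, not Neo4j's actual implementation: class and method names are made up, and instead of nulling out duplicates it compacts them away after sorting, which keeps the array binary-searchable; the index of a key in the deduplicated array then serves as its node-id.

```java
import java.util.Arrays;

// Hypothetical sketch of the dual-pass key -> node-id mapping.
public class SortedKeyIndex {
    private final long[] sortedKeys; // array index == node-id

    // Pass 1 has already collected every key into collectedKeys.
    public SortedKeyIndex(long[] collectedKeys) {
        long[] keys = collectedKeys.clone();
        Arrays.sort(keys);
        // Compact duplicates out of the sorted array; order is
        // preserved, so Arrays.binarySearch() still works.
        int n = 0;
        for (int i = 0; i < keys.length; i++) {
            if (i == 0 || keys[i] != keys[i - 1]) {
                keys[n++] = keys[i];
            }
        }
        this.sortedKeys = Arrays.copyOf(keys, n);
    }

    // Pass 2: look a key up; returns its node-id, or -1 if unknown.
    public long nodeId(long key) {
        int idx = Arrays.binarySearch(sortedKeys, key);
        return idx >= 0 ? idx : -1;
    }
}
```

A single sorted long[] plus binary search costs 8 bytes per key with no per-entry object overhead, which is why it tends to beat a boxed HashMap<Long, Long> for very large inputs.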
