Hi Pedro,

I've not considered all options, but one thing I notice is that your
transaction 'tx' wraps the entire for loop. This means the uncommitted
transaction state keeps growing, so you will use more and more memory
until you start swapping. I think you'll end up in GC hell, or ultimately
hit an out-of-memory error. Could you consider committing the transaction
in the commitInterval block? Right now you only log a message there, but
it would be an ideal place to commit and reopen the transaction.

The tx2 also seems redundant: since it is nested inside the main tx, it
should have no effect and can be removed.
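
Here is a rough sketch of the commit-and-reopen pattern I mean. It is not
tested against Neo4j: Tx and Db are hypothetical stand-ins for Transaction
and GraphDatabaseService (only beginTx/success/close are mirrored), and
addInBatches is a name I made up, so the sketch runs on its own. I have also
not verified that the node iterator stays valid across a commit; if it does
not, collecting the node ids up front would be an alternative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BatchedCommitSketch {

    /** Hypothetical stand-in for Neo4j's Transaction (success/close only). */
    interface Tx extends AutoCloseable {
        void success();
        @Override void close();
    }

    /** Hypothetical stand-in for the database's beginTx(). */
    interface Db {
        Tx beginTx();
    }

    /**
     * Applies addToLayer to every item and commits after every
     * commitInterval items, so no single transaction grows without bound.
     * Returns the number of transactions that were marked successful.
     */
    static <T> int addInBatches(Db db, Iterable<T> items, int commitInterval,
                                Consumer<T> addToLayer) {
        int commits = 0;
        long i = 0L;
        Tx tx = db.beginTx();
        try {
            for (T item : items) {
                addToLayer.accept(item);
                i++;
                if (i % commitInterval == 0) {
                    // Commit the current batch and start a fresh transaction.
                    tx.success();
                    tx.close();
                    commits++;
                    tx = db.beginTx();
                }
            }
            // Commit whatever is left in the last (possibly empty) batch.
            tx.success();
            commits++;
        } finally {
            tx.close();
        }
        return commits;
    }

    public static void main(String[] args) {
        // In-memory stub so the sketch runs without a database.
        Db stubDb = () -> new Tx() {
            @Override public void success() { }
            @Override public void close() { }
        };
        List<Integer> items = new ArrayList<>();
        for (int k = 0; k < 10; k++) {
            items.add(k);
        }
        List<Integer> layer = new ArrayList<>();
        int commits = addInBatches(stubDb, items, 4, layer::add);
        System.out.println("added " + layer.size() + " items in "
                + commits + " transactions");
    }
}
```

In your code the addToLayer step would be layer.add(node), and the log call
can stay inside the same commitInterval block.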

Regards, Craig


On Sat, Jul 26, 2014 at 6:34 PM, Pedro Santos <[email protected]> wrote:

> Hey guys,
>
> So I have around 70 million spatial records that I want to add to the
> spatial layer. (I've tested with a small set and everything runs smoothly:
> queries return the same results as PostGIS and the layer operations seem
> fine.)
> But when I try to add all the spatial records to the database, the
> performance degrades rapidly, to the point that it gets really slow at
> around 5 million records (around 2 hours of running time) and hangs at
> ~7.7 million (8 hours elapsed).
>
> Since the spatial index is an R-tree that uses the graph structure to
> construct itself, I am wondering why it degrades as the number of
> records increases.
> R-tree insertions are O(log n) on average, if I'm not mistaken, and that's
> why I'm concerned it might be something in the rearranging of the bounding
> boxes of the non-leaf nodes that is causing the addToLayer process to get
> slower over time.
>
> Currently I'm adding nodes to the layer like this (lots of hardcoded stuff,
> since I'm trying to figure out the problem before worrying about patterns
> and code style):
>
> Transaction tx = database.beginTx();
> try {
>     ResourceIterable<Node> layerNodes =
>         GlobalGraphOperations.at(database).getAllNodesWithLabel(label);
>     long i = 0L;
>     for (Node node : layerNodes) {
>         Transaction tx2 = database.beginTx();
>         try {
>             layer.add(node);
>             i++;
>             if (i % commitInterval == 0) {
>                 log("indexing (" + i + " nodes added) ... time in seconds: "
>                     + (1.0 * (System.currentTimeMillis() - startTime) / 1000));
>             }
>             tx2.success();
>         } finally {
>             tx2.close();
>         }
>     }
>     tx.success();
> } finally {
>     tx.close();
> }
>
> Any thoughts? Any ideas on how performance could be improved?
>
> P.S.: using the Java API
> Neo4j 2.1.2, Spatial 0.13
> Core i5 3570K @ 4.5 GHz, 32 GB RAM
> 2 TB 7200 rpm hard drive dedicated to the database (no OS, no virtual
> memory files, only the data itself)
>
> P.S. 2: All geometries are LineStrings (if that's important :P); they
> represent streets, roads, etc.
>
> P.S. 3: The nodes are already in the database; I only need to add them to
> the layer so that I can perform spatial queries. The bbox and wkb
> attributes are OK, tested and working for a small set.
>
> Thank you in advance
>
> --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>
