[Neo4j] Re: Slow node creation/commits with unique constraint in 2.0.0

Moss Prescott Sat, 21 Dec 2013 12:04:57 -0800

So if I understand correctly, the constraint provides a stronger guarantee 
of uniqueness than I can get with either MERGE or doing some kind of check 
myself through the Java API. In my mind that makes it an interesting 
feature, so I'll look out for improvements in future releases.
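
To make the difference in guarantees concrete, here is a rough sketch in plain Java. It is an analogy only, not Neo4j code: in-memory collections stand in for the graph, `ConcurrentHashMap.putIfAbsent` stands in for whatever the constraint does internally, and the class and method names are made up for illustration. The barrier exists purely to force the unlucky interleaving, in which both writers look up the key before either one creates it.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

public class UniquenessSketch {

    // Check-then-create, loosely analogous to a lookup followed by a create
    // with no uniqueness enforcement. Returns how many "nodes" exist for the
    // key after two concurrent writers have run.
    static int checkThenCreate(String key) {
        List<String> nodes = new CopyOnWriteArrayList<>();
        CyclicBarrier bothChecked = new CyclicBarrier(2);
        Runnable writer = () -> {
            boolean exists = nodes.contains(key);   // step 1: look up the key
            try {
                bothChecked.await();                // wait until both have looked
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
            if (!exists) {
                nodes.add(key);                     // step 2: create if it was absent
            }
        };
        runTwice(writer);
        return nodes.size();
    }

    // Atomic insert: the check and the create are a single operation, so only
    // one writer can win. Returns how many writers actually created the "node".
    static int atomicCreate(String key) {
        ConcurrentHashMap<String, Boolean> nodes = new ConcurrentHashMap<>();
        AtomicInteger created = new AtomicInteger();
        Runnable writer = () -> {
            if (nodes.putIfAbsent(key, Boolean.TRUE) == null) {
                created.incrementAndGet();
            }
        };
        runTwice(writer);
        return created.get();
    }

    // Runs the same task on two threads and waits for both to finish.
    static void runTwice(Runnable task) {
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("check-then-create: " + checkThenCreate("x") + " node(s)");
        System.out.println("atomic insert:     " + atomicCreate("x") + " node(s)");
    }
}
```

With the interleaving forced this way, the check-then-create path ends up with two entries for the same key, while the atomic path creates exactly one.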


Thanks,
- moss

On Saturday, December 21, 2013 11:21:07 AM UTC-7, Mark Needham wrote:
>
> Hi Moss,
>
> Yep, you're right; I've noticed this in some benchmarks as well. We're 
> working on improving that. A unique constraint should be a little slower 
> than a normal index, but not 10x slower as in your tests.
>
> However, unless you need to create the nodes covered by the unique 
> constraint concurrently, you may be able to work around it by using a 
> normal index,
>
> e.g. CREATE INDEX ON :Label(key)
>
> and then using e.g. MERGE (l:Label {key: {key}}). That isn't thread-safe: 
> if you ran that query concurrently and the node with key x hadn't been 
> created yet, you might end up with two copies of it. It would, however, 
> ensure that you don't create the same node twice after the initial creation.
>
> Cheers
> Mark
>
> On Friday, 20 December 2013 18:36:05 UTC, Moss Prescott wrote:
>>
>> Hi,
>>
>> I am experimenting with 2.0.0 and notice that it is approx. an order of 
>> magnitude slower to add nodes when a unique constraint is defined, versus 
>> an index on the same label and property. I've attached a simple program 
>> that demonstrates the behavior, adding increasing numbers of simple, 
>> unconnected nodes, with either an index or a constraint.
>>
>> I'm pretty sure the index and constraint are working properly (in the 
>> sense that they improve query performance), based on other tests I've done.
>>
>> I'm getting org.neo4j:neo4j:2.0.0 from Maven Central, and running on JDK 
>> 1.7.0_40 on Fedora with no particular flags (but apparently a max heap of 
>> 3GB).
>>
>> Here's the output of the test program:
>>
>> indexing...
>> waiting...
>> adding 1,000 nodes...
>> added...
>> committing...
>> ...done; 0.8s
>>
>> adding constraint...
>> adding 1,000 nodes...
>> added...
>> committing...
>> ...done; 3.3s
>>
>> indexing...
>> waiting...
>> adding 10,000 nodes...
>> added...
>> committing...
>> ...done; 2.9s
>>
>> adding constraint...
>> adding 10,000 nodes...
>> added...
>> committing...
>> ...done; 44.6s
>>
>> indexing...
>> waiting...
>> adding 100,000 nodes...
>> added...
>> committing...
>> ...done; 6.3s
>>
>> adding constraint...
>> adding 100,000 nodes...
>>
>> It runs for quite a while before even getting to the commit on the last 
>> transaction, at 100% CPU and with the heap climbing to about 1 GB.
>>
>> Maybe 100k is too many nodes in a single transaction? But as you can see 
>> it's more than 10x slower even at much more modest sizes. I also tried 
>> adding the same number of nodes, but spread across many smaller 
>> transactions, and it's still much slower with the constraint.
>>
>> Hopefully I'm doing something dumb here. Can anyone suggest a fix or 
>> confirm that this isn't working the way it should?
>>
>> Thanks,
>> - moss
>>
>
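
As a quick sanity check on the timings quoted above, the slowdown can be worked out directly from the two sizes where both runs completed. This is only arithmetic on the reported figures; the class and method names are made up for illustration, and the 100,000-node case is left out because no constraint timing was reported for it.

```java
import java.util.Locale;

public class SlowdownRatios {

    // Ratio of the constraint run time to the index run time, both in seconds.
    static double ratio(double constraintSeconds, double indexSeconds) {
        return constraintSeconds / indexSeconds;
    }

    public static void main(String[] args) {
        // Timings copied from the test program output in the original post.
        System.out.printf(Locale.ROOT, "1,000 nodes:  %.1fx slower%n", ratio(3.3, 0.8));
        System.out.printf(Locale.ROOT, "10,000 nodes: %.1fx slower%n", ratio(44.6, 2.9));
    }
}
```

On those numbers the constraint run is roughly 4x slower at 1,000 nodes and roughly 15x slower at 10,000 nodes, so the gap grows with the size of the transaction.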
