Hi, Johan.

I've allocated 500M to the relationship store, so that's probably not the
limiting factor (the current relationship store size on disk is only about
100M).

My theory is that we are manipulating a large number of relationships
(adding/deleting) within the transaction, and that many of the
relationships created during the transaction are deleted within that same
transaction and never actually persisted.  The scenario is the creation of
an ordered linked list using nodes/relationships: as each new item is
"inserted", 2-3 relationships are potentially destroyed/created.  In fact,
if 5,000 items are inserted, only 5,002 relationships are ultimately saved,
although 15,000+ will have been created in total, with 10,000 of them
deleted along the way.  I'm not sure how to optimize that much further,
though I'll look into it (there's a sketch of the insert pattern below).
I was considering using a Lucene index instead, but it has no obvious way
to let us traverse from both the beginning and the end of the "index".
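
For concreteness, the per-item insert looks roughly like this.  This is a
minimal sketch against the Neo4j 1.x embedded Java API; the NEXT
relationship type, the class name, and the "caller holds the transaction"
convention are placeholders for our actual code:

    import org.neo4j.graphdb.*;

    class OrderedListSketch
    {
        // NEXT is a placeholder name for our real list relationship type.
        static final RelationshipType NEXT =
                DynamicRelationshipType.withName( "NEXT" );

        // Splice newNode in immediately after prev; the caller holds the
        // transaction.  When prev already has a successor, this deletes
        // one relationship and creates two, which is where the 2-3
        // relationship operations per insert come from.
        static void insertAfter( Node prev, Node newNode )
        {
            Relationship oldNext =
                    prev.getSingleRelationship( NEXT, Direction.OUTGOING );
            if ( oldNext != null )
            {
                Node next = oldNext.getEndNode();
                // Often this relationship was created earlier in the same
                // transaction, so it is deleted before ever being saved.
                oldNext.delete();
                newNode.createRelationshipTo( next, NEXT );
            }
            prev.createRelationshipTo( newNode, NEXT );
        }
    }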

Best,

Rick


-----Original Message-----
From: [email protected] [mailto:[email protected]] On 
Behalf Of Johan Svensson
Sent: Tuesday, March 22, 2011 5:56 AM
To: Neo4j user discussions
Subject: Re: [Neo4j] Possible performance regression issue?

Could you start by verifying that it is not GC related? Turn on verbose GC
and see whether the larger transactions trigger long GC pauses.
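
On a Sun JVM that would be something like the following (gc.log is just an
example path):

    java -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
         -Xloggc:gc.log ...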

Another possible cause is that the relationship store file has grown to
the point where the configuration needs to be tweaked; the OS may be
flushing pages to disk when it should not. There is a guide on how to
investigate and tune this when running on Linux:
http://wiki.neo4j.org/content/Linux_Performance_Guide
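
The relevant settings live in neo4j.properties; for example (the sizes
below are only illustrative, you would size them to your actual store
files):

    neostore.nodestore.db.mapped_memory=50M
    neostore.relationshipstore.db.mapped_memory=500M
    neostore.propertystore.db.mapped_memory=100M
    neostore.propertystore.db.strings.mapped_memory=100M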

This could also be an issue with the setup of the persistence windows when
memory-mapped buffers are not in use. I remember those settings were
tweaked somewhat after the 1.1 release. We could try making some changes
there, but it would be better to do some profiling first.
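
(If I remember the 1.x behavior correctly, when

    use_memory_mapped_buffers=false

is set, the *mapped_memory values above size the plain Java buffer pools
backing the persistence windows instead, so the same tuning applies.)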

Regards,
Johan

On Mon, Mar 21, 2011 at 11:07 PM, Rick Bullotta
<[email protected]> wrote:
> Here's the quick summary of what we're encountering:
>
> We are inserting large numbers of activity stream entries on a nearly
> constant basis.  To optimize transaction handling, we queue these up and
> have a single scheduled task that reads the entries from the queue and
> persists them to Neo.  Within these transactions, it is possible that a
> very large number of relationships will be created and deleted (sometimes
> created and deleted all within the same transaction, since we are managing
> something similar to an index).  I've noticed that the time required to
> handle the inserts (not just the total, but the time per insert) degrades
> DRAMATICALLY if there are more than a few hundred entries to write.  It is
> very fast if there are < 100 entries in the batch, but very slow if there
> are > 1,000.  With Neo 1.1 we did not notice this behavior.  We have tried
> Neo 1.2 and 1.3, and both seem to exhibit it.
>
> Can anyone provide any insight into possible causes/fixes?
>
> Thanks,
>
> Rick
_______________________________________________
Neo4j mailing list
[email protected]
https://lists.neo4j.org/mailman/listinfo/user