Hi, Johan. I've allocated 500M to the relationship store, so that's probably not the limitation (the current relationship store size on disk is about 100M).

My thought is that we are manipulating a lot of relationships (adding/deleting) within the transaction; in fact, many of the relationships added during the transaction are deleted within that same transaction and never actually saved. The scenario is the creation of an ordered linked list built from nodes and relationships, where each newly "inserted" item potentially destroys/creates two or three relationships. If 5,000 items are inserted, only 5,002 relationships are ultimately saved, even though 15,000+ will have been created in total, with 10,000 of them deleted along the way. I'm not sure how much further that can be optimized, though I'll look into it. The pattern is roughly the one sketched below.
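A simplified sketch of that kind of list maintenance against the Neo4j 1.x embedded API. The FIRST/NEXT/LAST relationship type names, the append-at-tail logic, and the store path are illustrative only; the real code's bookkeeping (and hence the exact 5,002 figure) differs:

import org.neo4j.graphdb.Direction;
import org.neo4j.graphdb.DynamicRelationshipType;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Relationship;
import org.neo4j.graphdb.RelationshipType;
import org.neo4j.graphdb.Transaction;
import org.neo4j.kernel.EmbeddedGraphDatabase;

public class OrderedListSketch
{
    static final RelationshipType FIRST = DynamicRelationshipType.withName( "FIRST" );
    static final RelationshipType NEXT  = DynamicRelationshipType.withName( "NEXT" );
    static final RelationshipType LAST  = DynamicRelationshipType.withName( "LAST" );

    // Append an item to the tail of the list hanging off 'root'. Every append
    // after the first deletes the old LAST pointer and creates two new
    // relationships, so most of the relationships created in a large batch
    // never survive the transaction.
    static void append( Node root, Node item )
    {
        Relationship lastRel = root.getSingleRelationship( LAST, Direction.OUTGOING );
        if ( lastRel == null )
        {
            root.createRelationshipTo( item, FIRST ); // empty list
        }
        else
        {
            Node tail = lastRel.getEndNode();
            lastRel.delete();                         // destroyed within the same tx
            tail.createRelationshipTo( item, NEXT );
        }
        root.createRelationshipTo( item, LAST );      // new tail pointer
    }

    public static void main( String[] args )
    {
        GraphDatabaseService db = new EmbeddedGraphDatabase( "target/list-db" );
        Transaction tx = db.beginTx();
        try
        {
            Node root = db.createNode();
            for ( int i = 0; i < 5000; i++ )
            {
                Node item = db.createNode();
                item.setProperty( "seq", i );
                append( root, item );
            }
            tx.success();
        }
        finally
        {
            tx.finish();
        }
        db.shutdown();
    }
}

The real structure evidently churns a bit more per insert (roughly three relationship operations per item), but the shape of the problem is the same: the majority of relationships created inside the transaction are deleted again before commit.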
I was considering using the Lucene index instead, but it does not offer an obvious way to traverse from both the beginning and the end of the "index".

Best,
Rick

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Johan Svensson
Sent: Tuesday, March 22, 2011 5:56 AM
To: Neo4j user discussions
Subject: Re: [Neo4j] Possible performance regression issue?

Could you start by verifying it is not GC related? Turn on verbose GC and see if larger transactions trigger GC pause times.

Another possible cause could be that the relationship store file has grown, so the configuration needs to be tweaked. The OS may be flushing pages to disk when it should not. There is a guide on how to investigate and tweak that when running on Linux:

http://wiki.neo4j.org/content/Linux_Performance_Guide

This could also be an issue with the setup of the persistence windows when not using memory-mapped buffers. I remember those settings got tweaked some after the 1.1 release. We could try making some changes there, but it would be better to do some profiling first.

Regards,
Johan

On Mon, Mar 21, 2011 at 11:07 PM, Rick Bullotta <[email protected]> wrote:
> Here's the quick summary of what we're encountering:
>
> We are inserting large numbers of activity stream entries on a nearly
> constant basis. To optimize transaction handling, we queue these up and
> have a single scheduled task that reads the entries from the queue and
> persists them to Neo. Within these transactions, it's possible that a
> very large number of relationships will be created and deleted (sometimes
> created and deleted all within the same transaction, since we are managing
> something similar to an index).
>
> I've noticed that the time required to handle the inserts (not just the
> total, but the time per insert) degrades DRAMATICALLY if there are more
> than a few hundred entries to write. It is very fast if there are < 100
> entries in the batch, but very slow if there are over 1,000. With Neo 1.1
> we did not notice this behavior. We have tried Neo 1.2 and 1.3, and both
> seem to exhibit it.
>
> Can anyone provide any insight into possible causes/fixes?
>
> Thanks,
>
> Rick

_______________________________________________
Neo4j mailing list
[email protected]
https://lists.neo4j.org/mailman/listinfo/user

