Hi Ryan,

I think the issue you're running into is that node and relationship
records freed during deletion are *not reused* during the same
uptime of the database.
Freed records are only reused after a restart, so if you delete a lot
of data, restarting the db to enable reuse of those records helps.
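For illustration, here is the kind of full-wipe Cypher such tests
typically run (a sketch; 2.1.x has no DETACH DELETE yet, so the
relationships have to be deleted together with the nodes):

    // wipe all data: every node plus any relationship attached to it
    MATCH (n)
    OPTIONAL MATCH (n)-[r]-()
    DELETE r, n

Every record this frees stays behind as a hole in the store files, so
the files don't shrink while the database keeps running.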

Reusing freed records within the same uptime is a feature planned for one of the next releases, though.

These large chunks of unused records are also what makes generic
(non-label-based) scans of a store take longer and run less efficiently,
since the store files mapped into memory contain large free chunks.
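To make that concrete, a sketch using a hypothetical :Person label and
name property - the first query has to scan every node record in the
store, holes included, while the second touches only nodes carrying
the label and can use a schema index:

    // global scan: reads the whole node store, free chunks and all
    MATCH (n) WHERE n.name = 'Ryan' RETURN n;

    // label-based lookup, backed by an index
    CREATE INDEX ON :Person(name);
    MATCH (n:Person) WHERE n.name = 'Ryan' RETURN n;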

HTH,

Michael

On Wed, Nov 12, 2014 at 12:20 AM, Ryan Sattler <[email protected]> wrote:

> Hi,
>
> I've been developing an application using Neo4j (which will use an
> Enterprise install in the final version). As part of this we run a large
> number of integration tests against Neo. Each test deletes the existing
> data using a cypher query, then reads and writes as needed. Normally this
> works fine. However, a few times performance has catastrophically declined:
> e.g., writing a single node to an empty database (normally taking a few
> milliseconds) will start consistently taking around 3 seconds. Restarting
> Neo does not make any difference - the only fix I've found is to delete
> graph.db, after which everything is back to normal.
>
> Obviously this is a serious concern because in a production environment I
> can't just delete all our data. Any idea why this might be happening? And
> regardless, is there any way to recover from this without losing data? If
> not this seems like a major risk.
>
> We also had a similar issue in the past that seemed to be due to using an
> accidentally non-indexed query. This caused the time of the query to
> increase by about 2 seconds per attempt, even though there was the same
> amount of data in the DB each time (data being deleted and re-written
> between each test). Again, the only fix was deleting everything. This was
> fixed by adding a proper index, but now a similar issue has occasionally
> popped up on indexed queries as well. And at any rate, even though I'd
> expect a non-indexed query to be slow, I wouldn't expect its performance
> to decay sharply over time even when the total data size is not increasing.
>
> Perhaps deletes may not be working correctly?
>
> Context:
>
> Neo4j 2.1.5 community edition
> Linux
> 2GB heap
> SSD
> Cypher/REST
>
> --
> Ryan
>
