OK, thanks. In that case this particular issue shouldn't happen in prod, since we only do soft deletes there anyway. It's pretty inconvenient for testing, though.
-- Ryan

On Wednesday, November 12, 2014 11:43:13 AM UTC+11, Michael Hunger wrote:
>
> The problem is this:
>
> You had 10M nodes in your db. You deleted them all, so you have 10M empty
> records on disk. You don't restart. You create a node; it is put in record
> 10,000,001. So you have 10M empty records followed by one used record.
> After that has happened, a restart won't help you relocate the node, just
> reuse the ids of the 10M deleted nodes. If you had restarted after the big
> delete, the node would have been created with record id 0.
>
> I wrote a tool that can take a store and copy it to compact it (currently
> it doesn't change node ids, though), so this would only be useful for
> compacting rels. If you change node ids, you also have to recreate
> indexes, etc.
> https://github.com/jexp/store-utils/tree/21
>
> Your query is an all-node scan, which goes over all records in the db
> (and, if they are in use, loads and counts them).
>
> For a real-world query you'd do that on a label, like :Product or :Person,
> which should come back instantly even if you have millions of empty
> records.
>
> This: "Detected incorrectly shut down database, performing recovery.." is
> just recovery after a hard kill or crash, which is OK, as the transactions
> are written to and reapplied from the tx-log (WAL).
>
> HTH, Michael
>
> On Wed, Nov 12, 2014 at 1:11 AM, Ryan Sattler <[email protected]> wrote:
>
>> Some further investigation suggests that *one* source of problems is
>> non-indexed queries (e.g. "match (n) return count(n)") becoming very slow
>> even on a near-empty database (e.g. taking 1000 milliseconds when there
>> is only 1 node in there) after there has been a "performance meltdown" as
>> described in my previous thread posted to this group. Again, this does
>> not recover by restarting Neo, only by deleting the data.
>> It seems that when the database is shut down while there are stuck
>> threads, there is some sort of DB corruption. I do get the "Detected
>> incorrectly shut down database, performing recovery.." message on restart
>> in this case, but there doesn't seem to be any safe way to shut down? (I
>> ctrl-C'd from console mode.)
>>
>> (NB: I think there are also other issues, as I've seen indexed queries
>> have problems too, but I haven't been able to reproduce that one yet.)
>>
>> --
>> Ryan Sattler
>>
>> On Wednesday, November 12, 2014 10:20:42 AM UTC+11, Ryan Sattler wrote:
>>>
>>> Hi,
>>>
>>> I've been developing an application using Neo4j (which will use an
>>> Enterprise install in the final version). As part of this we run a large
>>> number of integration tests against Neo. Each test deletes the existing
>>> data using a Cypher query, then reads and writes as needed. Normally
>>> this works fine. However, a few times performance has declined
>>> catastrophically: e.g. writing a single node to an empty database
>>> (normally a few milliseconds) will start consistently taking, say, 3
>>> seconds. Restarting Neo does not make any difference; the only fix I've
>>> found is to delete graph.db, after which everything is back to normal.
>>>
>>> Obviously this is a serious concern, because in a production environment
>>> I can't just delete all our data. Any idea why this might be happening?
>>> And regardless, is there any way to recover from this without losing
>>> data? If not, this seems like a major risk.
>>>
>>> We also had a similar issue in the past that seemed to be due to an
>>> accidentally non-indexed query. This caused the time of the query to
>>> increase by about 2 seconds per attempt, even though there was the same
>>> amount of data in the DB each time (data being deleted and re-written
>>> between each test). Again, the only fix was deleting everything.
>>> This was fixed by adding a proper index, but now a similar issue has
>>> occasionally popped up on indexed queries as well. And at any rate, even
>>> though I'd expect a non-indexed query to be slow, I wouldn't expect its
>>> performance to decay sharply over time when the total data size is not
>>> increasing.
>>>
>>> Perhaps deletes are not working correctly?
>>>
>>> Context:
>>>
>>> Neo4j 2.1.5 Community Edition
>>> Linux
>>> 2GB heap
>>> SSD
>>> Cypher/REST
>>>
>>> --
>>> Ryan

--
You received this message because you are subscribed to the Google Groups "Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
For more options, visit https://groups.google.com/d/optout.
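To make Michael's label-scan point and the index fix concrete, here is a minimal Cypher sketch. Neo4j 2.1 syntax is used to match the thread; the `:Person` label and `name` property are illustrative assumptions, not from Ryan's actual schema:

```cypher
// All-node scan: walks every record in the store, including the
// millions of empty records left behind by a big delete - slow.
MATCH (n) RETURN count(n);

// Label scan: touches only records carrying the :Person label,
// so empty records are never visited - fast even after mass deletes.
MATCH (n:Person) RETURN count(n);

// Schema index (Neo4j 2.x syntax), the kind of fix Ryan describes
// for queries that look nodes up by property:
CREATE INDEX ON :Person(name);

// With the index in place, this lookup no longer degrades into a scan:
MATCH (n:Person) WHERE n.name = 'Alice' RETURN n;
```

The design point is that labels and schema indexes bound the work per query by the size of the matching data, not the size of the store file, which is what makes them resilient to the empty-record bloat described above.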
