I've made some progress on this. I've added a timeout to the config as described here: http://www.markhneedham.com/blog/2013/10/17/neo4j-setting-query-timeout/ . This seems to keep the DB responsive even after some queries have timed out. (Previously I'd added the 'max-execution-time' header, but that didn't seem to help.) In theory, none of my queries should be ultra-long-running in the first place (i.e. nothing touches the whole DB), but at least having the timeout seems to mitigate the problem.
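[Editor's note: for readers following along, the execution-guard settings from that blog post look roughly like the fragment below. This is a sketch for Neo4j 2.x; the guard was an unsupported feature at the time and the exact property names may differ between versions.]

```properties
# conf/neo4j.properties -- enable the (unsupported) execution guard
execution_guard_enabled=true

# conf/neo4j-server.properties -- abort REST queries running longer than 20s
org.neo4j.server.webserver.limit.executiontime=20000
```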
Unfortunately I don't think I could easily replicate what our code is doing in a simple script; it's pretty complex in terms of what goes on in each transaction (lots of reads and writes).

Thanks,
Ryan

On Wednesday, November 12, 2014 11:45:47 AM UTC+11, Michael Hunger wrote:
> Those locks _should_ resolve themselves. And it is not OK that they don't.
>
> Could you write a script that reproduces the scenario? That would be very helpful for us to fix any underlying issue.
>
> On Tue, Nov 11, 2014 at 11:49 PM, Ryan Sattler <[email protected]> wrote:
>> If we delete the data only, the indexes are still defined (e.g. they show up with :schema), so data coming from other sources will still be indexed correctly. If we delete graph.db, the indexes are gone and have to be re-specified somehow, or future reads will have bad performance. We have an endpoint to do this, but the overall complexity of stop-server, delete-files, restart-server, rebuild-indexes is relatively high. What we really want is just a "truncate database" query that simply deletes all data without acquiring locks, while keeping index definitions. This would be very useful for testing, compared to cobbling together some relatively complex delete script.
>>
>> If your own internal testing of Neo does a lot of datafile deletes, I suspect you might be overlooking a crippling performance issue that can occur if one doesn't do this; I will make a separate post about that.
>>
>> At any rate, thanks for the advice, and we will have a think to see if we can do larger transactions. Despite all this I'm still concerned about the behaviour of Neo in these cases: it just doesn't seem right that these locks get stuck and don't resolve themselves in a reasonable amount of time, even with transaction timeouts of a few seconds?
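[Editor's note: Neo4j 2.x has no built-in "truncate", but the effect described above (delete all data, keep schema indexes and constraints) can be sketched as a batched Cypher delete, run repeatedly until it returns 0. The batch size here is illustrative and should be tuned to the transaction memory available.]

```cypher
// Delete a batch of nodes together with their relationships;
// schema (indexes/constraints) is untouched because only data is deleted.
MATCH (n)
OPTIONAL MATCH (n)-[r]-()
WITH n, r LIMIT 50000
DELETE r, n
RETURN count(*);
```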
>>
>> --
>> Ryan
>>
>> On Tuesday, November 11, 2014 5:38:45 PM UTC+11, Michael Hunger wrote:
>>> I think the queries timing out are just an indication of all processing threads being blocked on locks, and adding delete queries on top of that doesn't make it better.
>>>
>>> But if you delete the data anyway, the indexes will be emptied as well?
>>>
>>> Addressing the root cause of not getting locks is the solution here:
>>> - larger-grained, grouped transactions
>>> - queuing and grouping/aggregating of requests (you can easily have 30k-50k ops in one request with your memory setup)
>>>
>>> Michael
>>>
>>> On Tue, Nov 11, 2014 at 3:10 AM, Ryan Sattler <[email protected]> wrote:
>>>> Yes, it's faster to do that, but that also deletes indexes, so it's more convenient to just delete the contents. At any rate, it seems like there is a problem if, once a few queries time out, it becomes seemingly permanently impossible to do a delete request: who knows what other queries will be broken or slow too? I.e., my concern is not so much with the delete query specifically as that Neo is getting into a bad state and not recovering, which could be bad if it happened in production.
>>>>
>>>> Thanks for the suggestion on deleting relationships first. We'll also keep the number of threads at or below the number of cores. We've reduced deadlocks by sorting some of the nodes being written, but we also need to handle retries for constraint violations (due to two transactions writing the same unique node at the same time).
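[Editor's note: the constraint-violation collisions described above can often be reduced by using MERGE on the uniquely constrained property instead of CREATE, so that concurrent writers converge on the same node rather than racing to create it. The label and property names below are hypothetical; under a unique constraint, MERGE in Neo4j 2.x still takes locks and can deadlock, so retries may still be needed, just less often.]

```cypher
// Converge on an existing node instead of failing with a constraint violation
MERGE (u:User {email: '[email protected]'})
ON CREATE SET u.created = timestamp()
RETURN u;
```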
>>>>
>>>> We are actually already doing a certain amount of grouping, for correctness reasons rather than performance, but I don't think we can do what's suggested in that blog post: for correctness we need to write whole trees of many types of nodes, with possible relationships-to/reuse-of existing nodes, as a single transaction, and it's hard to do two different trees in one transaction.
>>>>
>>>> On Tuesday, November 11, 2014 11:41:59 AM UTC+11, Michael Hunger wrote:
>>>>> Hi Ryan,
>>>>>
>>>>> So are you deleting the whole DB?
>>>>>
>>>>> Wouldn't it be faster to shut down the server, clean out the graph.db directory, and restart? Also, the IDs of the deleted elements are not reused until the next restart.
>>>>>
>>>>> Something I could imagine would help for your delete operation is to separate it: delete the relationships first, and then the nodes.
>>>>>
>>>>> start r=rel(*) with r limit 100000 delete r return count(*);
>>>>>
>>>>> match (n) with n limit 100000 delete n return count(*);
>>>>>
>>>>> In general you might want to think about queuing your requests, tagging them by "important node", and trying to group them by those tags. Max wrote a bit about that: http://maxdemarzi.com/2013/09/05/scaling-writes/
>>>>>
>>>>> Reducing the number of threads to the number of cores you have should also help, so they aren't all used up while not progressing. Some of your writes might fail in deadlocks and would have to be retried.
>>>>>
>>>>> HTH,
>>>>> Michael
>>>>>
>>>>> On Tue, Nov 11, 2014 at 1:27 AM, Ryan Sattler <[email protected]> wrote:
>>>>>> Here is the thread dump for a situation where isolated delete queries have started timing out: http://pastebin.com/u3Ecsp8i
>>>>>>
>>>>>> We do share some nodes quite heavily (i.e. they'll be involved in many transactions), so it could be a locking issue.
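[Editor's note: the `start r=rel(*)` form in Michael's first query above is legacy 1.x Cypher. In Neo4j 2.x the same rel-then-node batched delete can be sketched as below; run each statement repeatedly until it returns 0.]

```cypher
// Pass 1: delete relationships in batches (directed match, so each rel matches once)
MATCH ()-[r]->() WITH r LIMIT 100000 DELETE r RETURN count(*);

// Pass 2: once no relationships remain, delete the nodes in batches
MATCH (n) WITH n LIMIT 100000 DELETE n RETURN count(*);
```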
>>>>>> I don't think we can really do all of those in one transaction, especially since when we're writing we don't know what already exists in the DB.
>>>>>>
>>>>>> On Tuesday, November 11, 2014 10:36:43 AM UTC+11, Michael Hunger wrote:
>>>>>>> I would have bet on GC, but perhaps it was locking: threads waiting for each other to either finish writing the commit logs (which is synchronized), or waiting for other threads to release locks on nodes that are updated concurrently. Is there any means for you to group the nodes that are updated somehow?
>>>>>>>
>>>>>>> The thread dump (with jstack <pid>) would show blocked threads; it would be _very_ valuable.
>>>>>>>
>>>>>>> 200 writes per tx sounds OK. If there was one write per tx I'd be more worried. In 2.2 we'll add internal tx write batching.
>>>>>>>
>>>>>>> Cheers, Michael
>>>>>>>
>>>>>>> On Mon, Nov 10, 2014 at 11:24 PM, Ryan Sattler <[email protected]> wrote:
>>>>>>>> I will try that the next time that happens, but one time when I was able to cancel it normally (running as ./neo4j console), it took a while to stop and then had the following errors:
>>>>>>>>
>>>>>>>> 16:36:01.151 [Thread-192] WARN o.e.j.util.thread.QueuedThreadPool - qtp1198728399{STOPPING,8<=62<=80,i=0,q=0} Couldn't stop Thread[qtp1198728399-7727,5,main]
>>>>>>>> 16:36:01.151 [Thread-192] WARN o.e.j.util.thread.QueuedThreadPool - qtp1198728399{STOPPING,8<=62<=80,i=0,q=0} Couldn't stop Thread[qtp1198728399-9861,5,main]
>>>>>>>> 16:36:01.151 [Thread-192] WARN o.e.j.util.thread.QueuedThreadPool - qtp1198728399{STOPPING,8<=62<=80,i=0,q=0} Couldn't stop Thread[qtp1198728399-10008,5,main]
>>>>>>>> [dozens of near-identical "Couldn't stop Thread[qtp1198728399-NNNNN,5,main]" warnings elided]
>>>>>>>> 16:36:01.165 [Thread-192] WARN o.e.j.util.thread.QueuedThreadPool - qtp1198728399{STOPPING,8<=62<=80,i=0,q=0} Couldn't stop Thread[qtp1198728399-10885,5,main]
>>>>>>>>
>>>>>>>> It also gave the "Detected incorrectly shut down database, performing recovery.." error on restart, even though it was shut down normally (by Ctrl-C on the console process).
>>>>>>>>
>>>>>>>> I also realized that I forgot to include context for my question, so here we go:
>>>>>>>>
>>>>>>>> * Neo4j 2.1.5
>>>>>>>> * Cypher writes via REST
>>>>>>>> * Up to 200 writes per transaction
>>>>>>>> * A few hundred thousand total nodes
>>>>>>>> * 2GB RAM allocated with the CMS collector
>>>>>>>> * Doesn't seem to be having garbage-collection problems, according to New Relic
>>>>>>>>
>>>>>>>> On Tuesday, November 11, 2014 3:27:43 AM UTC+11, Mark Needham wrote:
>>>>>>>>> What's the state of the system if you take a thread dump when it's in unresponsive mode?
>>>>>>>>>
>>>>>>>>> On 10 November 2014 05:13, Ryan Sattler <[email protected]> wrote:
>>>>>>>>>> (I'm posting this question here since it seems a bit too non-specific for Stack Overflow.)
>>>>>>>>>>
>>>>>>>>>> I've been working on a Neo4j system that needs to ingest a lot of data. We've been testing the data load with many writing threads. Generally this works fine.
>>>>>>>>>> For example, writing with 5 threads completes normally. Writing with 10 threads, however, works fine (in fact, at ~2x speed) for a while, but then suddenly every transaction starts timing out, even though CPU usage was generally less than 50%.
>>>>>>>>>>
>>>>>>>>>> Also, it doesn't fully recover for a long time, if ever. Even minutes after the load has been completely turned off, Neo still remains unresponsive to certain kinds of queries. For example, I have a script that runs "MATCH (a) WITH a LIMIT 10000 OPTIONAL MATCH (a)-[r]-() DELETE a,r RETURN COUNT(*)" repeatedly, which normally works fine when run by itself, but in this case it keeps timing out even when there are no other queries. I have to kill and restart the process. Needless to say, this behaviour is less than ideal. I tried adding a "max-execution-time" header of 5 seconds, but this didn't seem to help much, if at all.
>>>>>>>>>>
>>>>>>>>>> In other tests I've seen weird spikes of bad performance 30 minutes to a few hours after one of these meltdowns, in a kind of "echo", even when the load has been greatly reduced in the meantime (and when that load would normally sustain good performance indefinitely).
>>>>>>>>>>
>>>>>>>>>> Any idea what's going on?
>>>>>>>>>>
>>>>>>>>>> TL;DR: Neo fails abruptly under heavy load and seems slow to recover even when the load is removed.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Ryan Sattler
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> You received this message because you are subscribed to the Google Groups "Neo4j" group.
>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
