On 06/08/14 22:32, Erik Salter wrote:
> Hi Sanne,
>
> So I guess I'm one of the five. This particular issue has been happening over and over again in my production environment. It is a performance and availability killer, since any sort of network blip will result in these lock scenarios. And network blips in my data center environment are very real and surprisingly common. In the past 5-6 weeks, we've had 3 redundant routers fail. The best thing that could happen is lock contention.
>
> Almost any sort of MERGE event will result in "stale" locks -- locks that are never released by the system.
This is separate from the put-while-holding-the-lock issue. Have you created a JIRA to keep track of this?

> Now consider that I can't throw more threads at the problem. I have 1100 OOB + ISPN threads available per cluster.

Understood.

> So what I've hacked in production is something that basically examines the LockManager (and TxTable) and manages locks that are > 5x lock acquisition time old. If I don't, then I start getting user threads and OOB threads backed up, and it's surprising how quickly these pools can exhaust. Nowhere near a scalable and elastic solution, but desperate times and all that.
>
> I can't agree more that any solution must take NBST into consideration, especially WRT pessimistic locks.
>
> Regards,
>
> Erik
>
> On 8/6/14, 2:50 PM, "Sanne Grinovero" <sa...@infinispan.org> wrote:
>
>> On 6 August 2014 18:49, Dan Berindei <dan.berin...@gmail.com> wrote:
>>>
>>> On Wed, Aug 6, 2014 at 6:19 PM, Bela Ban <b...@redhat.com> wrote:
>>>>
>>>> Hey Dan,
>>>>
>>>> On 06/08/14 16:13, Dan Berindei wrote:
>>>>> I could create the issue in JIRA, but I wouldn't make it high priority, because I think it has lots of corner cases with NBST and will cause headaches for the maintainers of state transfer ;)
>>>>
>>>> I do believe the put-while-holding-the-lock issue *is* a critical issue; anyone banging a cluster of Infinispan nodes with more than 1 thread will run into lock timeouts, with or without transactions. The only workaround for now is to use total order, but at the cost of reduced performance. However, once a system starts hitting the lock timeout issues, performance drops to a crawl, way slower than TO, and work starts to pile up, which compounds the problem.
>>>
>>> I wouldn't call it critical because you can always increase the number of threads. It won't be pretty, but it will work around the thread exhaustion issue.
>>
>> If Infinispan doesn't do it automatically, I wouldn't count that as a solution. Consider that the project goal is to make it easy to scale up/down dynamically.. if it requires experts to be on alert all the time for such manual interventions, it's a failure. Besides, I can count on one hand the people able to figure out such trouble.. so I agree this is critical.
>>
>>>> I believe doing a sync RPC while holding the lock on a key is asking for trouble and is (IMO) an anti-pattern.
>>>
>>> We also hold a lock on a key between the LockControlCommand and the TxCompletionNotificationCommand in pessimistic-locking caches, and there's at least one sync PrepareCommand RPC between them...
>>>
>>> So I don't see it as an anti-pattern; the only problem is that we should be able to do that without blocking internal threads in addition to the user thread (which is how tx caches do it).
>>>
>>>> Sorry if this has a negative impact on NBST, but should we not fix this because we don't want to risk a change to NBST?
>>>
>>> I'm not saying it will have a negative impact on NBST, I'm just saying I don't want to start implementing an incomplete proposal for the basic flow and leave the state transfer/topology change issues for "later". What happens when a node leaves, when a backup owner is added, or when the primary owner changes should be part of the initial discussion, not an afterthought.
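To make Erik's workaround above concrete, a rough sketch of such a stale-lock reaper could look like the following. LockRegistry and HeldLock are hypothetical placeholder interfaces standing in for whatever view of the LockManager (and TxTable) is actually available -- this is not the Infinispan API, just an illustration of the "older than 5x the lock acquisition timeout" heuristic:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Periodically scans currently held locks and force-releases any lock that has been
// held longer than 5x the configured lock acquisition timeout.
public class StaleLockReaper {

    // Minimal view of a held lock; a real integration would wrap the cache's lock manager.
    public interface HeldLock {
        Object key();
        long acquiredAtMillis();
        void forceRelease();   // would also have to clean up the owning tx, if any
    }

    public interface LockRegistry {
        Iterable<HeldLock> heldLocks();
    }

    private final LockRegistry registry;
    private final long staleThresholdMillis;
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    public StaleLockReaper(LockRegistry registry, long lockAcquisitionTimeoutMillis) {
        this.registry = registry;
        // Erik's heuristic: anything held longer than 5x the acquisition timeout is stale.
        this.staleThresholdMillis = 5 * lockAcquisitionTimeoutMillis;
    }

    public void start(long periodMillis) {
        scheduler.scheduleAtFixedRate(this::reap, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    private void reap() {
        long now = System.currentTimeMillis();
        for (HeldLock lock : registry.heldLocks()) {
            // The lock has outlived any reasonable operation; assume its owner is gone
            // (e.g. lost in a MERGE) and release it so waiting threads can make progress.
            if (now - lock.acquiredAtMillis() > staleThresholdMillis) {
                lock.forceRelease();
            }
        }
    }

    public void stop() {
        scheduler.shutdownNow();
    }
}

Force-releasing locks this way trades safety for availability, which is exactly why Erik calls it a desperate measure rather than a scalable and elastic solution.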
>> Absolutely! No change should be done leaving questions open, and I don't presume I suggested a solution; I was just trying to start a conversation on using "such a pattern". But I also believe we already had such conversations in past meetings, so my words were terse and short because I just wanted to remind everyone of those.
>>
>>> E.g. with your proposal, any updates in the replication queue on the primary owner will be lost when that primary owner dies, even though we told the user that we successfully updated the key. To quote from my first email on this thread: "OTOH, if the primary owner dies, we have to ask a backup, and we can lose the modifications not yet replicated by the primary."
>>>
>>> With Sanne's proposal, we wouldn't report to the user that we stored the value until all the backups confirmed the update, so we wouldn't have that problem. But I don't see how we could keep the sequence of versions monotonic when the primary owner of the key changes without some extra sync RPCs (also done while holding the key lock). IIRC TOA also needs some sync RPCs to generate its sequence numbers.
>>
>> I don't know how NBST v.21 is working today, but I trust it doesn't lose writes, and I think we should break the problem down into smaller problems; in this case I hope to build on the solid foundations of NBST.
>>
>> When the key is re-possessed by a new node (and this starts to generate "reference" write commands), you could restart the sequences: you don't need a universal monotonic number; all that backup owners need is an ordering rule and to understand that the commands coming from the new owner are more recent than those from the old owner. AFAIK you already have the notion of a view generation id? Essentially we'd need to store together with the entry not only its sequence but also the viewId. It's a very simplified (compact) vector clock, because in practice from this viewId we can extrapolate addresses and owners.. but it is simpler than the full pattern, as you only need the last one, since the longer tail of events is handled by NBST, I think?
>>
>> One catch is I think you need tombstones, but those are already needed for so many things that we can't avoid them :)
>>
>> Cheers,
>> Sanne
>>
>>>>> Besides, I'm still not sure I understood your proposals properly, e.g. whether they are meant only for non-tx caches or whether you want to change something for tx caches as well...
>>>>
>>>> I think this can be used for both cases; however, I think either Sanne's solution of using seqnos *per key* and updating in the order of seqnos or using Pedro's total order impl are probably better solutions.
>>>>
>>>> I'm not pretending these solutions are final (e.g. Sanne's solution needs more thought when multiple keys are involved), but we should at least acknowledge the issue exists, create a JIRA to prioritize it and then start discussing solutions.
>>>
>>> We've been discussing solutions without a JIRA just fine :)
>>>
>>> My feeling so far is that the thread exhaustion problem would be better served by porting TO to non-tx caches and/or changing non-tx locking to not require a thread. I have created an issue for TO [1], but IMO the locking rework [2] should be higher priority, as it can help both tx and non-tx caches.
>>>
>>> [1] https://issues.jboss.org/browse/ISPN-4610
>>> [2] https://issues.jboss.org/browse/ISPN-2849
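For illustration, the per-key ordering rule Sanne outlines above can be kept very small: a (viewId, seqNo) pair where a higher viewId always wins, so the new primary owner can restart its counter. EntryVersion and its methods below are hypothetical names, not Infinispan API, and the interaction with NBST and topology changes would still need the careful treatment Dan asks for:

// Compact per-entry version: the view/topology id in which the primary owner generated the
// update, plus a per-key sequence number that the owner restarts when it takes over the key.
public record EntryVersion(int viewId, long seqNo) implements Comparable<EntryVersion> {

    @Override
    public int compareTo(EntryVersion other) {
        // A write generated under a newer view supersedes any write from the old owner,
        // regardless of sequence numbers, so the new primary can restart its counter at 0.
        int byView = Integer.compare(viewId, other.viewId);
        if (byView != 0) {
            return byView;
        }
        // Same primary owner (same view): plain per-key monotonic ordering.
        return Long.compare(seqNo, other.seqNo);
    }

    // A backup applies an incoming update only if it is strictly newer than what it holds.
    public static boolean shouldApply(EntryVersion incoming, EntryVersion current) {
        return current == null || incoming.compareTo(current) > 0;
    }
}

As Sanne notes, deletes would need tombstones carrying such a version as well, so a backup can tell a late write for a removed key apart from a genuinely newer one.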
>>>>> On Wed, Aug 6, 2014 at 1:02 PM, Bela Ban <b...@redhat.com> wrote:
>>>>>
>>>>> Seems like this discussion has died with the general agreement that this is broken and with a few proposals on how to fix it, but without any follow-up action items.
>>>>>
>>>>> I think we (= someone from the ISPN team) need to create a JIRA, preferably blocking.
>>>>>
>>>>> WDYT?
>>>>>
>>>>> If not, here's what our options are:
>>>>>
>>>>> #1 I'll create a JIRA
>>>>> #2 We'll hold the team meeting in Krasnojarsk, Russia
>>>>> #3 There will be only vodka, no beers in #2
>>>>> #4 Bela will join the ISPN team
>>>>>
>>>>> Thoughts ?

--
Bela Ban, JGroups lead (http://www.jgroups.org)
_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev