On 16 Jun 2014, at 16:57, Dan Berindei <[email protected]> wrote:
> On Thu, Jun 12, 2014 at 4:54 PM, Galder Zamarreño <[email protected]> wrote:
>
> Hi all,
>
> I’m working on the implementation of this, and the solution noted in the JIRA does not work for situations where you have to return a previous value that might have been overridden due to partial operation application. Example (assuming 1 owner only):
>
> 1. The remote client does a put(“key”, 1) on Node-A.
> 2. The remote client does a replace(“key”, 2) on Node-A, but the operation fails to replicate and gets partially applied on Node-A only.
> 3. The remote client, seeing the replace failed, retries the replace(“key”, 2) on Node-B. replace underneath finds that the previous value of “key” is 2 (since it got partially applied on Node-A), so it succeeds, but the previous value returned is 2, which is wrong. The previous value should have been 1, but this value is gone…
>
> In my view, there are two ways to solve this issue:
>
> 1. Make Hot Rod caches transactional. By doing so, operations are not partially applied: they’re either done fully cluster wide or they’re rolled back. I’ve verified this, and the test passes once you make the cache transactional. The downside is of course performance. IOW, anyone using conditional operations, or relying on previous values, would need transactional caches. This should work well with the retry mechanism being implemented for ISPN-2956, which is still needed. A big question here is whether Hot Rod caches should be transactional by default or vice versa. If they’re transactional, our performance will go down for sure, but we won’t have this type of issue (with retry in place). If you’re not transactional, you are faster but you’re exposed to these edge cases, and you need to consider them when deploying your app, something people might miss, although we could provide WARN messages when conditional operations or Flag.FORCE_RETURN_VALUE are used with non-transactional caches.
>
> The same problem [1] can appear in embedded caches. So if there's an argument to make HotRod caches transactional by default, it should apply to embedded caches as well.
>
> I like the idea of logging a warning when FORCE_RETURN_VALUE or the non-versioned conditional operations are used from HotRod. But we have to be careful with the wording, because inconsistencies can appear in transactional mode as well [2].

I think this (non-transactional by default, and a WARN when such methods are used with a non-transactional cache) might be the best stopgap solution.

> [1] https://issues.jboss.org/browse/ISPN-4286
> [2] https://issues.jboss.org/browse/ISPN-3421

Have you considered other fixes for [2]?

> 2. Get rid of returning the previous value in the Hot Rod protocol for modifying operations. For conditional operations, returning true/false is at least enough to see whether the condition was applied. So replaceIfUnmodified/replace/remove (conditional) would only return true/false. This would be complicated due to the reliance on the Map/ConcurrentMap APIs. Maybe something to consider for when we stop relying on JDK APIs.
>
> I'm not sure we can completely get rid of the return values; even though JCache doesn't extend Map, it still has a getAndPut method.

It does have such a method, but we could potentially make it throw an exception saying that it is not supported.

> I also considered applying corrective actions, but that’s very messy and prone to concurrency issues, so I quickly discarded that.
>
> Yeah, rolling back partial modifications is definitely not an option.
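For reference, the failing sequence at the top of the thread looks roughly like this from the client side. This is only a minimal sketch, assuming the standard Hot Rod Java client API, a server on the default address and a distributed cache with a single owner; the class name is illustrative, and the failover and retry happen inside the client and cluster rather than in user code.

    import org.infinispan.client.hotrod.Flag;
    import org.infinispan.client.hotrod.RemoteCache;
    import org.infinispan.client.hotrod.RemoteCacheManager;

    public class ReplaceRetryExample {
       public static void main(String[] args) {
          // Connects to a Hot Rod server on the default host/port.
          RemoteCacheManager rcm = new RemoteCacheManager();
          RemoteCache<String, Integer> cache = rcm.getCache();

          cache.put("key", 1); // step 1: written on Node-A

          // Steps 2-3: FORCE_RETURN_VALUE asks the server to ship the previous
          // value back. If the replace is applied on Node-A, replication fails
          // and the client retries against Node-B, 'previous' can come back as
          // 2 instead of the expected 1, because the original value is gone.
          Integer previous = cache.withFlags(Flag.FORCE_RETURN_VALUE).replace("key", 2);
          System.out.println("previous = " + previous);

          rcm.stop();
       }
    }

With option 1 (a transactional cache backing the Hot Rod endpoint), the partial apply in step 2 could not survive, so the retried replace would still see the value written in step 1.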
> Any other options? Thoughts on the options above?
>
> I was waiting for Sanne to say this, but how about keeping a version history?
>
> If we had the chain of values/versions for each key, we could look up the version of the current operation in the chain when retrying. If it's there, we could infer that the operation was already applied, and return the value before it in the chain as the previous value.
>
> Of course, there are the usual downsides:
> 1. Maintaining the history might be tricky, especially around state transfers.
> 2. Performance will also be affected, maybe getting closer to the tx performance.

Yeah, keeping a version history could help, but it’d be quite a beast to implement for 7.0, including garbage collection of old versions, etc.
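To make the version-history idea concrete, here is a rough, purely illustrative sketch of a per-key chain of versioned values. The class and method names are hypothetical (not an existing Infinispan API), the version is assumed to be supplied with the write, and the hard parts called out above (state transfer, concurrent writers, garbage collection of old versions) are deliberately ignored.

    import java.util.Deque;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentLinkedDeque;

    // Hypothetical per-key history: newest entry first in each chain.
    final class ValueHistory<K, V> {

       static final class Entry<V> {
          final long version; // version assigned to the write
          final V value;      // value written by that operation
          Entry(long version, V value) { this.version = version; this.value = value; }
       }

       private final Map<K, Deque<Entry<V>>> chains = new ConcurrentHashMap<>();

       // Record a write and return the value it overwrote (null if the entry was created).
       V write(K key, long version, V value) {
          Deque<Entry<V>> chain = chains.computeIfAbsent(key, k -> new ConcurrentLinkedDeque<>());
          Entry<V> current = chain.peekFirst();
          chain.addFirst(new Entry<>(version, value));
          return current == null ? null : current.value;
       }

       // On a retry: if the operation's version is already in the chain, it was
       // applied before, so the "previous value" is whatever sits just below it.
       // Returns null both for "entry was created by this write" and for
       // "version not found"; a real implementation would distinguish the two.
       V previousValueFor(K key, long version) {
          Deque<Entry<V>> chain = chains.get(key);
          if (chain == null) {
             return null;
          }
          boolean foundOwnWrite = false;
          for (Entry<V> e : chain) {          // iterates newest -> oldest
             if (foundOwnWrite) {
                return e.value;               // the value the retried write replaced
             }
             if (e.version == version) {
                foundOwnWrite = true;
             }
          }
          return null;
       }
    }

On a retry, the new primary owner would consult something like previousValueFor(key, version) instead of returning the value currently stored, which is exactly the lookup described above.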
>
> Cheers,
>
> On 26 May 2014, at 18:11, Galder Zamarreño <[email protected]> wrote:
>
> > Hi all,
> >
> > I looked into ISPN-2956 last week and I think we have a solution for it, which requires a protocol change [1].
> >
> > Since we’re in the middle of the Hot Rod 2.0 development, this is a good opportunity to implement it.
> >
> > Cheers,
> >
> > [1] https://issues.jboss.org/browse/ISPN-2956?focusedCommentId=12970541&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12970541
> >
> > On 14 May 2014, at 09:36, Dan Berindei <[email protected]> wrote:
> >
> >> On Tue, May 13, 2014 at 6:40 PM, Radim Vansa <[email protected]> wrote:
> >> On 05/13/2014 03:58 PM, Dan Berindei wrote:
> >>>
> >>> On Mon, May 12, 2014 at 1:54 PM, Radim Vansa <[email protected]> wrote:
> >>> @Dan: It's absolutely correct to do the further writes in order to make the cache consistent; I am not arguing against that. You've fixed the outcome (state of the cache) well. My point was that we should let the user know that the value he gets is not 100% correct when we already know that - and, given the API, the only option to do that seems to me to be throwing an exception.
> >>>
> >>> The problem, as I see it, is that users also expect methods that throw an exception to *not* modify the cache. So we would break some of the users' expectations anyway.
> >>
> >> When the response from the primary owner does not arrive soon, we throw a timeout exception and the cache is modified anyway, isn't it? If we throw ~ReturnValueUnreliableException, the user has at least some chance to react. Currently, for code requiring a 100% reliable value, you can't do anything but ignore the return value, even for CAS operations.
> >>
> >> Yes, but we don't expect the user to handle a TimeoutException in any meaningful way. Instead, we expect the user to choose his hardware and configuration to avoid timeouts, if he cares about consistency. How could you handle an exception that tells you "I may have written the value you asked me to in the cache, or maybe not. Either way, you will never know what the previous value was. Muahahaha!" in an application that cares about consistency?
> >>
> >> But the proposed ReturnValueUnreliableException can't be avoided by the user; it has to be handled every time the cluster membership changes. So it would be more like WriteSkewException than TimeoutException. And when we throw a WriteSkewException, we don't write anything to the cache.
> >>
> >> Remember, most users do not care about the previous value at all - that's the reason why JCache and our HotRod client don't return the previous value by default. Those that do care about the previous value use the conditional write operations, and those already work (well, except for the scenario below). So you would force everyone to handle an exception that they don't care about.
> >>
> >> It would make sense to throw an exception if we didn't return the previous value by default and the user requested the return value explicitly. But we do return the value by default, so I don't think it would be a good idea for us.
> >>
> >>> @Sanne: I was not suggesting that for now - sure, value versioning is (I hope) on the roadmap. But that's more complicated; I was thinking just about making an adjustment to the current implementation.
> >>>
> >>> Actually, just keeping a history of values would not fix the return value in all cases.
> >>>
> >>> When retrying a put on the new primary owner, the primary owner would still have to compare our value with the latest value, and return the previous value if they are equal. So we could have something like this:
> >>>
> >>> A is the originator, B is the primary owner, k = v0
> >>> A -> B: put(k, v1)
> >>> B dies before writing v1, C is now primary owner
> >>> D -> C: put(k, v1) // another put operation from D, with the same value
> >>> C -> D: null
> >>> A -> C: retry_put(k, v1)
> >>> C -> A: v0 // C assumes A is overwriting its own value, so it's returning the previous one
> >>>
> >>> To fix that, we'd need a unique version generated by the originator - kind of like a transaction id ;)
> >>
> >> Is it such a problem to associate a unique ID with each write? The history implementation seems to me like the more complicated part.
> >>
> >> I also think maintaining a version history would be quite complicated, and it would also make it harder for users to estimate their cache's memory usage. That's why I was trying to show that it's not a panacea.
> >>
> >>> And to fix the HotRod use case, the HotRod client would have to be the one generating the version.
> >>
> >> I agree.
> >>
> >> Radim
> >>
> >>> Cheers
> >>> Dan
> >>>
> >>> Radim
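As an aside, the "unique version generated by the originator" idea above can be sketched as follows. This is purely hypothetical and not an existing Infinispan or Hot Rod API: if every write carried a client-generated operation id, the primary owner could remember the result it returned when the write was first applied and hand back the same result on a retry, instead of comparing values. Eviction of old ids, state transfer and concurrency details are ignored.

    import java.util.Map;
    import java.util.Optional;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical primary owner that deduplicates retried writes by an
    // originator-generated operation id.
    final class DedupingPrimaryOwner<K, V> {

       private final Map<K, V> store = new ConcurrentHashMap<>();
       // operation id -> previous value returned when the write was first applied
       private final Map<Long, Optional<V>> appliedWrites = new ConcurrentHashMap<>();

       V put(long operationId, K key, V value) {
          Optional<V> remembered = appliedWrites.get(operationId);
          if (remembered != null) {
             // Retry of a write we already applied: return the remembered result,
             // even if another client wrote the same value in the meantime.
             return remembered.orElse(null);
          }
          V previous = store.put(key, value);
          appliedWrites.put(operationId, Optional.ofNullable(previous));
          return previous;
       }
    }

In the example above, C could then tell A's retry_put apart from D's independent put of the same value, because the two writes would carry different operation ids.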
> >>> On 05/12/2014 12:02 PM, Sanne Grinovero wrote:
> >>>> I don't think we are in a position to decide what is a reasonable compromise; we can do better. For example - as Radim suggested - it might seem reasonable to have the older value around for a little while. We'll need a little bit of history of values and tombstones anyway for many other reasons.
> >>>>
> >>>> Sanne
> >>>>
> >>>> On 12 May 2014 09:37, Dan Berindei <[email protected]> wrote:
> >>>>> Radim, I would contend that the first and foremost guarantee that put() makes is to leave the cache in a consistent state. So we can't just throw an exception and give up, leaving k=v on one owner and k=null on another.
> >>>>>
> >>>>> Secondly, put(k, v) being atomic means that either it succeeds, writes k=v in the cache and returns the previous value, or it doesn't succeed and doesn't write k=v in the cache. Returning the wrong previous value is bad, but leaving k=v in the cache is just as bad, even if all the owners have the same value.
> >>>>>
> >>>>> And last, we can't have one node seeing k=null, then k=v, then k=null again, when the only write we did on the cache was a put(k, v). So trying to undo the write would not help.
> >>>>>
> >>>>> In the end, we have to make a compromise, and I think returning the wrong value in some of the cases is a reasonable compromise. Of course, we should document that :)
> >>>>>
> >>>>> I also believe ISPN-2956 could be fixed so that HotRod behaves just like embedded mode after the ISPN-3422 fix, by adding a RETRY flag to the HotRod protocol and to the cache itself.
> >>>>>
> >>>>> Incidentally, transactional caches have a similar problem when the originator leaves the cluster: ISPN-3421 [1]. And we can't handle transactional caches any better than non-transactional caches until we expose transactions to the HotRod client.
> >>>>>
> >>>>> [1] https://issues.jboss.org/browse/ISPN-3421
> >>>>>
> >>>>> Cheers
> >>>>> Dan
> >>>>>
> >>>>> On Mon, May 12, 2014 at 10:21 AM, Radim Vansa <[email protected]> wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> recently I've stumbled upon an already expected behaviour (one instance is [1]) which did not get much attention.
> >>>>>>
> >>>>>> In a non-tx cache, when the primary owner fails after the request has been replicated to the backup owner, the request is retried in the new topology. Then, the operation is executed on the new primary (the previous backup). The outcome has already been fixed in [2], but the return value may be wrong. For example, when we do a put, the return value for the second attempt will be the currently inserted value (although the entry was just created). The same situation may happen for other operations.
> >>>>>>
> >>>>>> Currently, it's not possible to return the correct value (because it has already been overwritten and we don't keep a history of values), but shouldn't we rather throw an exception if we were not able to fulfil the API contract?
> >>>>>>
> >>>>>> Radim
> >>>>>>
> >>>>>> [1] https://issues.jboss.org/browse/ISPN-2956
> >>>>>> [2] https://issues.jboss.org/browse/ISPN-3422
> >>>>>>
> >>>>>> --
> >>>>>> Radim Vansa <[email protected]>
> >>>>>> JBoss DataGrid QA
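Tying this back to option 1 at the top of the thread, and to the point that the same reasoning applies to embedded caches: avoiding partially applied operations boils down to configuring the cache as transactional. Below is a minimal embedded-mode sketch, assuming the standard ConfigurationBuilder/DefaultCacheManager API; the cache name is arbitrary and the single-owner DIST_SYNC settings simply mirror the example above.

    import org.infinispan.Cache;
    import org.infinispan.configuration.cache.CacheMode;
    import org.infinispan.configuration.cache.Configuration;
    import org.infinispan.configuration.cache.ConfigurationBuilder;
    import org.infinispan.configuration.global.GlobalConfigurationBuilder;
    import org.infinispan.manager.DefaultCacheManager;
    import org.infinispan.transaction.TransactionMode;

    public class TransactionalCacheExample {
       public static void main(String[] args) {
          ConfigurationBuilder builder = new ConfigurationBuilder();
          // Distributed, synchronous, single owner - matching the scenario above.
          builder.clustering().cacheMode(CacheMode.DIST_SYNC).hash().numOwners(1);
          // Transactional, so a write is either applied fully or rolled back.
          builder.transaction().transactionMode(TransactionMode.TRANSACTIONAL);
          Configuration cfg = builder.build();

          DefaultCacheManager manager =
                new DefaultCacheManager(GlobalConfigurationBuilder.defaultClusteredBuilder().build());
          manager.defineConfiguration("hotrod-backed-cache", cfg);
          Cache<String, Integer> cache = manager.getCache("hotrod-backed-cache");

          cache.put("key", 1);
          Integer previous = cache.replace("key", 2); // 1, unless another writer intervened
          System.out.println("previous = " + previous);

          manager.stop();
       }
    }

Whether this should become the default for Hot Rod caches is exactly the performance trade-off debated earlier in the thread.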
--
Galder Zamarreño
[email protected]
twitter.com/galderz

_______________________________________________
infinispan-dev mailing list
[email protected]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
