Hi all,

I’ve been looking into ISPN-2956 over the last week and I think we have a solution for it which requires a protocol change [1].
Since we’re in the middle of the Hot Rod 2.0 development, this is a good opportunity to implement it.

Cheers,

[1] https://issues.jboss.org/browse/ISPN-2956?focusedCommentId=12970541&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12970541

On 14 May 2014, at 09:36, Dan Berindei <[email protected]> wrote:

> On Tue, May 13, 2014 at 6:40 PM, Radim Vansa <[email protected]> wrote:
> On 05/13/2014 03:58 PM, Dan Berindei wrote:
>>
>> On Mon, May 12, 2014 at 1:54 PM, Radim Vansa <[email protected]> wrote:
>> @Dan: It's absolutely correct to do the further writes in order to make the cache consistent; I am not arguing against that. You've fixed the outcome (the state of the cache) well. My point was that we should let the user know that the value he gets is not 100% correct when we already know that - and given the API, the only option to do that seems to me to be throwing an exception.
>>
>> The problem, as I see it, is that users also expect methods that throw an exception to *not* modify the cache.
>> So we would break some of the users' expectations anyway.
>
> When the response from the primary owner does not arrive soon, we throw a timeout exception and the cache is modified anyway, isn't it?
> If we throw ~ReturnValueUnreliableException, the user has at least some chance to react. Currently, for code requiring a 100% reliable value, you can't do anything but ignore the return value, even for CAS operations.
>
> Yes, but we don't expect the user to handle a TimeoutException in any meaningful way. Instead, we expect the user to choose his hardware and configuration to avoid timeouts, if he cares about consistency. How could you handle an exception that tells you "I may have written the value you asked me to in the cache, or maybe not. Either way, you will never know what the previous value was. Muahahaha!" in an application that cares about consistency?
>
> But the proposed ReturnValueUnreliableException can't be avoided by the user; it has to be handled every time the cluster membership changes. So it would be more like WriteSkewException than TimeoutException. And when we throw a WriteSkewException, we don't write anything to the cache.
>
> Remember, most users do not care about the previous value at all - that's the reason why JCache and our HotRod client don't return the previous value by default. Those that do care about the previous value use the conditional write operations, and those already work (well, except for the scenario below). So you would force everyone to handle an exception that they don't care about.
>
> It would make sense to throw an exception if we didn't return the previous value by default and the user requested the return value explicitly. But we do return the value by default, so I don't think it would be a good idea for us.
>
>> @Sanne: I was not suggesting that for now - sure, value versioning is (I hope) on the roadmap. But that's more complicated; I was thinking just about making an adjustment to the current implementation.
>>
>> Actually, just keeping a history of values would not fix the return value in all cases.
>>
>> When retrying a put on the new primary owner, the primary owner would still have to compare our value with the latest value, and return the previous value if they are equal. So we could have something like this:
>>
>> A is the originator, B is the primary owner, k = v0
>> A -> B: put(k, v1)
>> B dies before writing v1, C is now primary owner
>> D -> C: put(k, v1) // another put operation from D, with the same value
>> C -> D: null
>> A -> C: retry_put(k, v1)
>> C -> A: v0 // C assumes A is overwriting its own value, so it returns the previous one
>>
>> To fix that, we'd need a unique version generated by the originator - kind of like a transaction id ;)
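The bookkeeping behind this "unique version generated by the originator" could look roughly like the sketch below. This is only an illustration of the idea, not Infinispan code: WriteId, LastWrite and the store are invented names, and real code would also need to expire the per-key metadata and handle concurrency properly.

    import java.util.HashMap;
    import java.util.Map;

    public class VersionedPrimary {

        // Identifies one logical write across retries: which node (or
        // HotRod client) originated it, and which of its operations it was.
        record WriteId(String originator, long sequence) {}

        // What the primary remembers about the last write applied to a key.
        record LastWrite(Object previousValue, Object value, WriteId id) {}

        private final Map<Object, LastWrite> store = new HashMap<>();

        // The first attempt and any retry carry the same WriteId, so a new
        // primary can recognise a duplicate regardless of the value written.
        public synchronized Object put(Object key, Object value, WriteId id) {
            LastWrite last = store.get(key);
            if (last != null && last.id().equals(id)) {
                // This exact operation was already applied before the old
                // primary died: return the previous value recorded back
                // then, not the value this same operation wrote.
                return last.previousValue();
            }
            Object previous = (last == null) ? null : last.value();
            store.put(key, new LastWrite(previous, value, id));
            return previous;
        }
    }

In the exchange above, A's retry carries the same WriteId as its first attempt, while D's put carries a different one, so the new primary C can tell the two writes apart even though the values are equal - exactly the disambiguation that comparing values alone cannot provide.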
> Is it such a problem to associate a unique ID with each write? The history implementation seems to me like the more complicated part.
>
> I also think maintaining a version history would be quite complicated, and it would also make it harder for users to estimate their cache's memory usage. That's why I was trying to show that it's not a panacea.
>
>> And to fix the HotRod use case, the HotRod client would have to be the one generating the version.
>
> I agree.
>
> Radim
>
>> Cheers
>> Dan
>>
>> Radim
>>
>> On 05/12/2014 12:02 PM, Sanne Grinovero wrote:
>> > I don't think we are in a position to decide what is a reasonable compromise; we can do better.
>> > For example - as Radim suggested - it might seem reasonable to have the older value around for a little while. We'll need a little bit of history of values and tombstones anyway for many other reasons.
>> >
>> > Sanne
>> >
>> > On 12 May 2014 09:37, Dan Berindei <[email protected]> wrote:
>> >> Radim, I would contend that the first and foremost guarantee that put() makes is to leave the cache in a consistent state. So we can't just throw an exception and give up, leaving k=v on one owner and k=null on another.
>> >>
>> >> Secondly, put(k, v) being atomic means that it either succeeds - writing k=v in the cache and returning the previous value - or it doesn't succeed and doesn't write k=v in the cache. Returning the wrong previous value is bad, but leaving k=v in the cache is just as bad, even if all the owners have the same value.
>> >>
>> >> And last, we can't have one node seeing k=null, then k=v, then k=null again, when the only write we did on the cache was a put(k, v). So trying to undo the write would not help.
>> >>
>> >> In the end, we have to make a compromise, and I think returning the wrong value in some of the cases is a reasonable compromise. Of course, we should document that :)
>> >>
>> >> I also believe ISPN-2956 could be fixed so that HotRod behaves just like embedded mode after the ISPN-3422 fix, by adding a RETRY flag to the HotRod protocol and to the cache itself.
>> >>
>> >> Incidentally, transactional caches have a similar problem when the originator leaves the cluster: ISPN-3421 [1]
>> >> And we can't handle transactional caches any better than non-transactional caches until we expose transactions to the HotRod client.
>> >>
>> >> [1] https://issues.jboss.org/browse/ISPN-3421
>> >>
>> >> Cheers
>> >> Dan
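On the client side, the RETRY flag Dan proposes might look roughly like the sketch below. The flag bit, the toy framing and the Transport interface are all assumptions for illustration; the real flag and wire format would be defined by the Hot Rod 2.0 protocol itself.

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;

    public class RetryAwareClient {

        static final int RETRY = 0x0800; // hypothetical bit in the request flags

        interface Transport {
            // Sends a request, routed to the current primary owner.
            byte[] send(byte[] request) throws IOException;
        }

        private final Transport transport;

        public RetryAwareClient(Transport transport) {
            this.transport = transport;
        }

        public byte[] put(byte[] key, byte[] value) throws IOException {
            try {
                return transport.send(encodePut(key, value, 0));
            } catch (IOException suspectedPrimaryFailure) {
                // The primary may have died *after* replicating the write
                // to the backups. Retry against the new topology, but tell
                // the server this is a retry: the new primary then knows
                // the "previous value" it returns may be this very write.
                return transport.send(encodePut(key, value, RETRY));
            }
        }

        private byte[] encodePut(byte[] key, byte[] value, int flags) {
            // Toy framing, just to keep the sketch self-contained:
            // [flags:2][keyLen:1][key][valueLen:1][value]
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            out.write((flags >> 8) & 0xFF);
            out.write(flags & 0xFF);
            out.write(key.length);
            out.writeBytes(key);
            out.write(value.length);
            out.writeBytes(value);
            return out.toByteArray();
        }
    }

The server-side half would propagate the same flag to the cache, so the new primary knows the previous value it returns may be unreliable - matching what embedded mode already knows after the ISPN-3422 fix.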
>> >>
>> >> On Mon, May 12, 2014 at 10:21 AM, Radim Vansa <[email protected]> wrote:
>> >>> Hi,
>> >>>
>> >>> recently I've stumbled upon an already expected behaviour (one instance is [1]) which did not get much attention.
>> >>>
>> >>> In a non-tx cache, when the primary owner fails after the request has been replicated to the backup owner, the request is retried in the new topology. Then, the operation is executed on the new primary (the previous backup). The outcome has already been fixed in [2], but the return value may be wrong. For example, when we do a put, the return value for the second attempt will be the currently inserted value (although the entry was just created). The same situation may happen for other operations.
>> >>>
>> >>> Currently, it's not possible to return the correct value (because it has already been overwritten and we don't keep a history of values), but shouldn't we rather throw an exception if we were not able to fulfil the API contract?
>> >>>
>> >>> Radim
>> >>>
>> >>> [1] https://issues.jboss.org/browse/ISPN-2956
>> >>> [2] https://issues.jboss.org/browse/ISPN-3422
>> >>>
>> >>> --
>> >>> Radim Vansa <[email protected]>
>> >>> JBoss DataGrid QA

--
Galder Zamarreño
[email protected]
twitter.com/galderz

_______________________________________________
infinispan-dev mailing list
[email protected]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
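The scenario from Radim's first mail can be replayed with a plain map standing in for the backup owner; nothing below is Infinispan code, it only mirrors the message sequence.

    import java.util.HashMap;
    import java.util.Map;

    public class WrongReturnValueDemo {
        public static void main(String[] args) {
            Map<String, String> nodeC = new HashMap<>(); // backup owner of k

            // put(k, v): the primary owner B replicates to the backup C...
            nodeC.put("k", "v");
            // ...and then B dies before answering the originator.

            // The originator retries in the new topology, where C has
            // become the primary owner. C already holds the replicated
            // value from the first attempt:
            String returned = nodeC.put("k", "v");

            // The entry was created by this very operation, yet the caller
            // gets "v" instead of the correct previous value, null:
            System.out.println(returned); // prints v
        }
    }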
