On Mon, Nov 14, 2016 at 4:49 PM, Pedro Ruivo <pe...@infinispan.org> wrote:
>
> On 14-11-2016 14:26, Radim Vansa wrote:
>> Hi,
>>
>> I was thinking about ISPN-3918 [1] and I've realized that while this
>> happens in current implementation only rarely during state transfer,
>> with Triangle v4 this could happen more often.
>>
>> Conditional command is always executed on primary owner, and so far
>> during the execution of conditional command (incl. replication to
>> backup-owners) the other commands to the same key were blocking in the
>> locking layer. Triangle v4 removes this blocking, and if in thread T1
>> you do:
>>
>> T1: replace(key, A, B)
>>
>> and in second thread T2
>>
>> T2: replace(key, A, C)
>> T2: get(key)
>>
>> the T2.replace can now fail before the T1.replace (successful) is
>> replicated to backup owner. When T2 is, by chance, the backup owner, the
>> T2.replace completes with false, the T2.get will be served locally and
>> it will still returns A.
>>
>> We should decide if this is an issue, and either close ISPN-3918 (not a
>> bug) or think about triangle routing of unsuccessful commands.
>
> well... I think we could send the unsuccessful ack in FIFO order(*1). In
> this way, it would force the backup owner to process the T1 operation
> before processing the ack. get() will then return the correct value.
>
> *1 or send only in FIFO when the backup owner is the originator and the
> command is unsuccessful.
> *1 or merge the ack command + backup-write command and send them in FIFO
>
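
To make the interleaving above concrete, here is a minimal, single-threaded
sketch of it. This is a toy model, not Infinispan code: the class
TriangleRaceSketch, its two maps (standing in for the primary and backup
owners) and the inFlight queue (standing in for backup writes that have not
been delivered yet) are all hypothetical names made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Toy model only: plain maps stand in for the primary and backup owners,
// and a queue stands in for backup writes still in flight.
// None of these names are Infinispan APIs.
class TriangleRaceSketch {
    final Map<String, String> primary = new HashMap<>();
    final Map<String, String> backup = new HashMap<>();
    final Queue<String[]> inFlight = new ArrayDeque<>();

    // Conditional write: evaluated on the primary, replicated later.
    boolean replace(String key, String expected, String update) {
        if (!expected.equals(primary.get(key))) {
            return false; // unsuccessful: nothing is sent to the backup
        }
        primary.put(key, update);
        inFlight.add(new String[] {key, update});
        return true;
    }

    // Read served locally on the backup owner.
    String getOnBackup(String key) {
        return backup.get(key);
    }

    // Read sent to the primary owner instead.
    String getOnPrimary(String key) {
        return primary.get(key);
    }

    // Apply all pending backup writes in the order they were issued.
    void deliverBackupWrites() {
        String[] write;
        while ((write = inFlight.poll()) != null) {
            backup.put(write[0], write[1]);
        }
    }

    public static void main(String[] args) {
        TriangleRaceSketch cache = new TriangleRaceSketch();
        cache.primary.put("key", "A");
        cache.backup.put("key", "A");

        boolean t1 = cache.replace("key", "A", "B"); // T1 succeeds on the primary
        boolean t2 = cache.replace("key", "A", "C"); // T2 fails: primary holds B
        String stale = cache.getOnBackup("key");     // backup has not seen B yet
        cache.deliverBackupWrites();
        String fresh = cache.getOnBackup("key");     // now the backup holds B

        System.out.println(t1 + " " + t2 + " " + stale + " " + fresh);
        // prints "true false A B"
    }
}
```

In this model the failed T2.replace and the following local get disagree
about the value until the pending backup write is delivered, while a read
sent to the primary (getOnPrimary) already sees B.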

Merging sounds like it would send too much extra stuff to the originator.
Sending only the ack command to the originator when it's also a backup
owner (and making it FIFO) sounds much better :)

OTOH, having T2 run on the backup owner guarantees that get() will look up
the key locally, but there is a chance of the same thing happening when T2
runs on any non-owner. So I don't think making the ack command FIFO would
really solve the problem.

The more I think about it, the more it seems this bug is just another
example of distributed caches not having session consistency [1]. The fact
is that distributed caches allow read operations to return the values of
concurrent writes in any order, and having one of those reads also be a
write muddies the waters a bit, but doesn't really change anything.
(Except, of course, that the implementation in master happens to preserve
the order most of the time.)

I vote to close ISPN-3918, but I would like to open another issue (or
reuse this one) to add a "force consistent read" operation/flag/configuration
that would force the cache to read the value from the primary owner. We've
been talking about this a lot, and at the very least we need to have the
option so we know whether users actually choose it.

[1]: https://github.com/infinispan/infinispan/wiki/Consistency-guarantees-in-Infinispan#41-non-transactional