Hi guys

I've updated the document at https://community.jboss.org/wiki/Non-blockingStateTransfer

This time I've added an overview and changed "locking requirements" to "command execution during state transfer". I also assumed that we are going to use the tree of cache views.

More comments inline.

On Fri, Mar 16, 2012 at 9:16 AM, Bela Ban <[email protected]> wrote:
>
>
> On 3/15/12 11:29 AM, Dan Berindei wrote:
>
>> That was basically what we did in the blocking design: the ST commands
>> could execute during ST, but regular commands would block until the
>> end of the ST. With async caches, that meant we would use JGroups' 1
>> queue per sender (so not a global queue, but close).
>>
>> The problem was not with the regular commands that arrived after the
>> start of the ST, but with the commands that had already started
>> executing when ST started. This is the classic example:
>> 1. A prepare command for Tx1 locks k1 on node A
>> 2. A prepare command for Tx2 tries to acquire lock k1 on node A
>> 3. State transfer starts up and blocks all write commands
>> 4. The Tx1 commit command, which will unlock k1, arrives but can't run
>> until state transfer has ended
>> 5. The Tx2 prepare command times out on the lock acquisition after 10
>> seconds (by default)
>> 6. State transfer can now proceed and push or receive data.
>> 7. The Tx1 commit can now run and unlock k1. It's too late for Tx2, however.
>>
>> The solution I had in mind for the old design was to add some kind of
>> deadlock detection to the LockManager and throw a
>> StateTransferInProgress when a deadlock with the state transfer is
>> detected.
>
>
> OK. I don't like the old design, as ST has to wait until all pending TXs
> (those with locks held) have committed before we can make progress. If
> the lock acquisition timeout is high, we'll have to wait for a long time.
>
>
>> With the new design I thought it would be simpler to not acquire a big
>> lock for the entire duration of the write command that would prevent
>> state transfer. Instead I would acquire different locks for much
>> shorter amounts of time, and at the beginning of each lock acquisition
>> we would just check that the command's view id is still the correct
>> one.
>
>
> OK.
> Perhaps an overview of the new design in the document is warranted.
> There's a section on transfer of CacheEntries and one on locks, but I
> didn't see a combined discussion. Perhaps an example like the one above
> would be good?
>

I hope I've improved on this in the new version.

> I now realize how much simpler the use of total order is here: since all
> updates in a cluster happen in total order, we don't need to acquire
> locks in 1 phase and release them in another phase. ST is then just
> another update, inserted at a certain place in the stream of updates.
>

Unfortunately I think that would mean a blocking state transfer, because all the other updates would have to wait on state to be transferred.

> I assume the Cloud-TM guys don't do state transfer in their prototype,
> or do they? Pedro? If not, then there needs to be an implementation of
> ST for TO.
>

I'm curious about this as well.

Cheers
Dan
_______________________________________________
infinispan-dev mailing list
[email protected]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
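
For anyone who wants something more concrete than prose, here is a rough Java sketch of the idea discussed above for the new design: no big lock held for the whole write command, only short lock acquisitions, each one starting with a check that the command's view id is still the current one. Everything in it is hypothetical and only for illustration (ViewCheckSketch, StaleViewException, installView, lockK1 and unlockK1 are made-up names, not Infinispan classes), and it assumes a single key k1 guarded by a plain ReentrantLock.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch, not Infinispan code: one node, one key (k1).
class ViewCheckSketch {

    // Hypothetical exception: the command was created in an older cache view.
    static class StaleViewException extends RuntimeException {
        StaleViewException(int commandViewId, int currentViewId) {
            super("command view id " + commandViewId
                    + " != current view id " + currentViewId);
        }
    }

    // View id currently installed on this node (assumed to start at 1).
    private final AtomicInteger currentViewId = new AtomicInteger(1);

    // Stands in for the per-key lock on k1.
    private final ReentrantLock k1Lock = new ReentrantLock();

    // Called when state transfer installs a new cache view.
    void installView(int newViewId) {
        currentViewId.set(newViewId);
    }

    // Short lock acquisition: check the view id first, then try the lock.
    void lockK1(int commandViewId, long timeoutMillis) {
        int current = currentViewId.get();
        if (commandViewId != current) {
            // Fail fast instead of waiting for the lock timeout.
            throw new StaleViewException(commandViewId, current);
        }
        try {
            if (!k1Lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("timed out acquiring lock on k1");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while acquiring lock on k1", e);
        }
    }

    void unlockK1() {
        k1Lock.unlock();
    }

    public static void main(String[] args) {
        ViewCheckSketch node = new ViewCheckSketch();

        // A command from view 1 locks and unlocks k1 while view 1 is current.
        node.lockK1(1, 100);
        node.unlockK1();

        // State transfer installs view 2; a late command from view 1 is rejected.
        node.installView(2);
        try {
            node.lockK1(1, 100);
        } catch (StaleViewException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The point of the sketch is only that a command from an old view is turned away immediately at lock acquisition, so it never sits on a lock until the timeout the way Tx2 does in the classic example. What the caller does with a rejected command (retry in the new view, forward it, fail the transaction) is left open here.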
