Adding infinispan-dev, as other people might be interested in this as well.

On 26 Feb 2013, at 10:57, Pedro Ruivo wrote:

> Hi,
>
> I found the blocking problem with the state transfer this morning. It happens
> because of the reordering of a regular and an OOB message.
>
> Below is a simplification of what is happening for two nodes:
>
> A: total order broadcasts rebalance_start
>
> B: (incoming thread) delivers rebalance_start
> B: has no segments to request, so the rebalance is done
> B: sends async request with rebalance_confirm (unicast #x)
> B: sends the rebalance_start response (unicast #x+1) (the response is a
>    regular message)
>
> A: receives rebalance_start response (unicast #x+1)
> A: in UNICAST2, it detects the message is out-of-order and blocks the
>    response in the sender window (i.e. the message #x is missing)
> A: receives the rebalance_confirm (unicast #x)
> A: delivers rebalance_confirm. Infinispan blocks this command until all the
>    rebalance_start responses are received ==> this originates a deadlock!
>    (because the response is blocked in the unicast layer)

I'm wondering why this happens: the "rebalance_start response" (regular unicast #x+1) should not wait for the rebalance_confirm (OOB message), as there's no ordering between them. It should simply be passed up the stack by JGroups.

> Question: can the request's response message always be sent as OOB? (I think
> the answer should be no...)
>
> My suggestion: when I deliver a rebalance_confirm command (which is sent
> async), can I move it to a thread in async_thread_pool_executor?
>
> Weird thing: last night I tried more than 5 times in a row with UNICAST3 and
> it never blocked. Could this mean a problem with UNICAST3, or was I just lucky?
>
> Any other suggestion?
>
> Cheers,
> Pedro

Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)
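
For anyone following along, the sequence Pedro describes on node A can be sketched as a toy, single-threaded model. This is only an illustration under stated assumptions: it is not JGroups or Infinispan code, the class and method names (`ReorderSketch`, `OrderedWindow`, `simulate`) are made up, and it assumes both messages share one sequence space with one delivery thread, which is exactly the behaviour being questioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model only: hypothetical names, not the JGroups UNICAST2 implementation.
class ReorderSketch {
    static final String CONFIRM = "rebalance_confirm";
    static final String RESPONSE = "rebalance_start response";

    /** Delivers messages strictly in seqno order on the calling thread. */
    static final class OrderedWindow {
        private long next;                       // next seqno allowed to be delivered
        private final TreeMap<Long, String> pending = new TreeMap<>();
        private boolean responseDelivered = false;
        private boolean stalled = false;         // stands in for a blocked delivery thread
        final List<String> log = new ArrayList<>();

        OrderedWindow(long firstSeqno) { this.next = firstSeqno; }

        void receive(long seqno, String msg) {
            pending.put(seqno, msg);
            if (seqno != next) {
                log.add("buffered #" + seqno + " " + msg);
            }
            // Drain everything that is now in order, unless the handler is stuck.
            while (!stalled && pending.containsKey(next)) {
                String m = pending.remove(next);
                next++;
                handle(m);
            }
        }

        private void handle(String msg) {
            if (msg.equals(CONFIRM) && !responseDelivered) {
                // The real handler would block here waiting for the response,
                // which can only be delivered by this same (now stuck) loop.
                stalled = true;
                log.add("stalled in " + CONFIRM + ": " + RESPONSE + " is still queued");
                return;
            }
            if (msg.equals(RESPONSE)) {
                responseDelivered = true;
            }
            log.add("delivered " + msg);
        }

        boolean isQueued(long seqno) { return pending.containsKey(seqno); }
    }

    /** Replays the arrival order from the email: #x+1 first, then #x. */
    static List<String> simulate() {
        long x = 10;                              // arbitrary seqno for illustration
        OrderedWindow window = new OrderedWindow(x);
        window.receive(x + 1, RESPONSE);          // arrives early, held back
        window.receive(x, CONFIRM);               // delivered, handler waits forever
        return window.log;
    }

    public static void main(String[] args) {
        simulate().forEach(System.out::println);
    }
}
```

In this model the response stays queued behind the confirm forever. If the OOB message were delivered independently of the regular sequence, as suggested above, the stall would not occur.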
