Unfortunately, the recipes don’t list edge cases. The algorithm is described 
here:

http://zookeeper.apache.org/doc/trunk/recipes.html#sc_doubleBarriers

As I recall, when I wrote it I just followed the recipe as listed. I think some 
others have contributed to the code as well. I never really understood what the 
use-case for this recipe is - I’ve never needed it.

Let’s add time to your scenario (where EB is entered/blocked, C is crashed, W 
is working and * is not active):

     C1     C2     C3   C4
T1   EB      *      *    *
T2   EB     EB      *    *
T3   EB      C      *    *
T4   EB      *     EB    *
       
It seems to me that at T4, C1 cannot reliably know that C2 ever entered. It may 
or may not have gotten a notification, depending on timing. So the state of the 
barrier is unknown, right? If C1 gets notified prior to C2 crashing, though, 
the barrier should proceed by definition. So I guess the real question is how 
the barrier should handle errors, i.e., the case where it has enough members 
to continue at T2 but one of the members then leaves prematurely. We may have 
to allow users to specify the behavior. If users can handle members crashing, 
then it works predictably:

     C1     C2     C3   C4
T1   EB      *      *    *
T2   EB     EB      *    *
T3   EB      C      *    *
T4   EB      *     EB    *
T5   Barrier can start as there are enough members
T6   W       *      C    *   this is OK because we had 2 enter at T5
T7   Here C1 blocks and times out on leave
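
To make the timing problem concrete, here is a rough, in-memory sketch of the 
enter phase for T1 through T4. It is a toy model, not what Curator actually 
does: BarrierModel and enoughMembersNow are made-up names, and a plain set 
stands in for the ephemeral child nodes under the barrier path. The only 
point is that the answer C1 gets depends on when it looks.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Toy model of the barrier's enter phase (hypothetical, not Curator code). */
class BarrierModel {
    private final int memberQty;                              // the "member quantity", n=2 in the question
    private final Set<String> present = new LinkedHashSet<>(); // stands in for ephemeral child nodes

    BarrierModel(int memberQty) {
        this.memberQty = memberQty;
    }

    /** A client enters: in this model it simply registers itself. */
    void enter(String client) {
        present.add(client);
    }

    /** A client crashes: its registration disappears, as an ephemeral node would. */
    void crash(String client) {
        present.remove(client);
    }

    /** What a waiting client sees if it counts the members at this instant. */
    boolean enoughMembersNow() {
        return present.size() >= memberQty;
    }

    public static void main(String[] args) {
        BarrierModel barrier = new BarrierModel(2);

        barrier.enter("C1");                                       // T1
        System.out.println("T1 " + barrier.enoughMembersNow());   // T1 false

        barrier.enter("C2");                                       // T2
        System.out.println("T2 " + barrier.enoughMembersNow());   // T2 true

        barrier.crash("C2");                                       // T3
        System.out.println("T3 " + barrier.enoughMembersNow());   // T3 false

        barrier.enter("C3");                                       // T4
        System.out.println("T4 " + barrier.enoughMembersNow());   // T4 true
    }
}
```

If C1 happens to check at T2 it proceeds; if it only gets around to checking 
at T3 it keeps blocking until C3 shows up at T4. Whether the barrier should 
remember that it was satisfied at T2 is exactly the behavior users might 
need to be able to specify.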

On August 25, 2015 at 10:15:55 AM, Mike Drob ([email protected]) wrote:

Devs,  

I was working on CURATOR-233 and I realized that I don't really understand  
the semantics of the DistributedDoubleBarrier when there are more clients  
than the given member quantity. Note that I'm not asking about what the  
code currently does (because I suspect it has a few inconsistencies) but  
what it should do according to the API contract.  

Let's suppose I create a DDB with n=2.  

client1.enter() // blocks until at least 2 clients enter  
client2.enter() // returns immediately  

client3.enter() // returns immediately?  

client3.leave() // blocks until all clients leave?  
client2.leave() // blocks until all clients leave?  

client4.enter() // imagine a straggling thread, no idea what should even  
                // happen here  

Mike  
