Hi,
I'm just learning ZK and want to make sure I understand everything correctly. In the Barrier Tutorial, there seems to be a race condition that can cause a deadlock when the code executed inside the barrier is short and one client has higher latency than another.

For example, say the number of process nodes required to start computation is 2.

1) Process 1 creates its node and enables a children watcher.
2) Process 2 creates its node, and the node creation fires a watcher notification to process 1.
3) Process 2 retrieves the children with list size 2, executes its code, and deletes its node.
4) Process 1 receives the watcher notification from the creation of node 2 and requests the children, whose size is now 1.
5) Process 1 indefinitely waits for process 2's node to be created, while process 2 indefinitely waits for process 1's node to be deleted.
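
The five steps can be replayed in a toy, single-threaded Python model. This is only a sketch of the interleaving, not ZooKeeper code: `ToyStore`, `simulate`, and the `inbox` counter are hypothetical names, and the model assumes that a child watch fires once and must be re-armed by the next read.

```python
class ToyStore:
    """Hypothetical stand-in for a parent znode with one-shot child watches."""

    def __init__(self):
        self.children = set()
        self.watchers = set()   # clients holding an armed one-shot watch
        self.inbox = {}         # client -> number of undelivered notifications

    def get_children(self, client, watch=True):
        # Reading the children optionally arms a watch for this client.
        if watch:
            self.watchers.add(client)
        return sorted(self.children)

    def _notify(self):
        # Each armed watch fires once and is then cleared.
        for client in self.watchers:
            self.inbox[client] = self.inbox.get(client, 0) + 1
        self.watchers = set()

    def create(self, name):
        self.children.add(name)
        self._notify()

    def delete(self, name):
        self.children.discard(name)
        self._notify()


def simulate(size=2):
    store = ToyStore()

    # 1) Process 1 creates its node and arms a child watch.
    store.create("p1")
    assert store.get_children("p1") == ["p1"]  # fewer than `size`: p1 waits

    # 2) Process 2 creates its node; p1's notification is queued, not delivered.
    store.create("p2")

    # 3) Process 2 sees both children, runs its short task, deletes its node.
    p2_entered = len(store.get_children("p2")) >= size
    store.delete("p2")

    # 4) The delayed notification reaches p1, which re-reads the children.
    store.inbox["p1"] -= 1
    p1_view = store.get_children("p1")
    p1_entered = len(p1_view) >= size

    # 5) Process 2, now leaving, waits for the child list to become empty.
    p2_left = len(store.get_children("p2")) == 0

    return {
        "p1_view": p1_view,
        "p1_entered": p1_entered,
        "p2_entered": p2_entered,
        "p2_left": p2_left,
        "undelivered_to_p1": store.inbox["p1"],
    }


result = simulate()
```

In this model `result` shows process 2 inside the barrier and unable to leave, process 1 never entering, and no notification left in flight for process 1, which is the stuck state described in step 5.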

Are my assumptions of ZK's behavior correct? If so, I can't think of any solutions that are both efficient and correct. The only correct solutions I can think of either require watches on all children, or require sending the child nodes and their data to processes multiple times based on a parent data watch event.

To any developers out there, how difficult would it be to customize the ZK code to both send data along with notifications and to have permanent watchers? This would guarantee a notification for every change, at the cost of higher latency. Having both options would be analogous to having both TCP and UDP protocols available for use depending on the particular requirements of the application.
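
As a rough illustration of why a permanent watcher that carries data would help, the Python sketch below compares a client that sees every change with one that only takes a single read after the fact. The event format `(kind, name)` and all function names are hypothetical; this is a toy model of the idea, not a description of how ZK delivers notifications.

```python
def apply_events(events):
    """Fold a stream of (kind, name) events into the final set of children."""
    children = set()
    for kind, name in events:
        if kind == "created":
            children.add(name)
        elif kind == "deleted":
            children.discard(name)
    return children


def barrier_reached_in_stream(events, size):
    """True if the child count reached `size` after any prefix of the stream."""
    return any(
        len(apply_events(events[:i])) >= size
        for i in range(1, len(events) + 1)
    )


def barrier_reached_in_snapshot(events, size):
    """True if a single read taken after all events shows `size` children."""
    return len(apply_events(events)) >= size


# The changes from the example: both nodes created, then node 2 deleted.
events = [("created", "p1"), ("created", "p2"), ("deleted", "p2")]

print(barrier_reached_in_stream(events, 2))    # True
print(barrier_reached_in_snapshot(events, 2))  # False
```

A client that receives the full stream can tell that two children existed at some point, while the late single read only sees one child.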

Thanks,
Justin
