Hello.

I am trying to implement a distributed executor service on top of ZooKeeper and have run into a problem I am not sure how to deal with.

First, let me show how I am trying to implement this. I create a number of prefix nodes for the different messages belonging to the same job: Job, Queued, Reserved, Result. When someone wants to submit a job, he creates an EPHEMERAL_SEQUENTIAL node under Job. The returned name is the job name, and it is used under every other prefix. All created nodes are EPHEMERAL.

The sequence of operations for successful processing looks like this (S - submitter, P - processor):

1) S creates a marker under the Queued prefix
2) P creates a marker under the Reserved prefix
3) S removes the marker under the Queued prefix
4) P sets data under the Result prefix
5) S gets the data and removes all the nodes
6) P (after Job is removed) also removes all the nodes that are left

This scheme handles both Processor failures (the Reserved marker will disappear and the Submitter will restart from (1)) and Submitter failures (as soon as Job disappears, the job is treated as cancelled by the Processor).

The problem is that I need to use a watch for data removal. An exists watch for data removal may produce memory leaks if the data was already removed at the time of the exists call: the watch will be registered but never triggered. And I don't see a way to clean up watches on a given path.

While writing this e-mail I had a thought: I can use getData to set up a watch for removal. As far as I can see from the client code, the watch won't be set unless the item exists.

Another problem is waiting for "Reserved" to appear. I need to set up an exists watch, but there may be a case when the task is cancelled before it is taken by any processor. Should I create a fake "Reserved" item and then instantly remove it just to trigger the watch (and let it be garbage collected)? If so, can this be done with asynchronous calls without any callback?
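
To make the two cases concrete, here is a toy in-memory model (Python, not the real ZooKeeper client) of how I understand watch registration to behave. The class ToyZNodeStore, its method names and the paths are all made up for illustration. The assumption that an exists watch is registered even on a missing node, while a getData watch is not, is only my reading of the client code.

```python
class ToyZNodeStore:
    """Toy in-memory model of one-shot watches. NOT a ZooKeeper client."""

    def __init__(self):
        self.nodes = {}    # path -> data
        self.watches = {}  # path -> list of pending one-shot callbacks

    def create(self, path, data=b""):
        if path in self.nodes:
            raise ValueError("node already exists: " + path)
        self.nodes[path] = data
        self._fire(path, "created")

    def delete(self, path):
        del self.nodes[path]
        self._fire(path, "deleted")

    def exists(self, path, watch=None):
        # Assumption: the watch is registered whether or not the node is there.
        if watch is not None:
            self.watches.setdefault(path, []).append(watch)
        return path in self.nodes

    def get_data(self, path, watch=None):
        # Assumption: a failed read leaves no watch behind.
        if path not in self.nodes:
            raise KeyError(path)
        if watch is not None:
            self.watches.setdefault(path, []).append(watch)
        return self.nodes[path]

    def _fire(self, path, event):
        # One-shot: pending watches are dropped before they are called.
        for callback in self.watches.pop(path, []):
            callback(path, event)

    def pending_watches(self):
        return sum(len(callbacks) for callbacks in self.watches.values())


store = ToyZNodeStore()
events = []

def on_event(path, event):
    events.append((path, event))

# Case 1: the node is already gone when exists is called, so the watch lingers.
store.exists("/Job/job-1", watch=on_event)
leaked = store.pending_watches()

# Case 2: getData on a missing node fails and registers nothing.
try:
    store.get_data("/Job/job-2", watch=on_event)
except KeyError:
    pass
after_get = store.pending_watches()

# Case 3: the "fake Reserved" trick: create and delete to flush the exists watch.
store.exists("/Reserved/job-1", watch=on_event)
store.create("/Reserved/job-1")
store.delete("/Reserved/job-1")
```

In this model the watch from case 1 stays pending forever, the getData attempt in case 2 adds nothing, and the create in case 3 consumes the exists watch on "Reserved" so only the case 1 watch is left.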
--
Best regards,
Vitalii Tymchyshyn
