[
https://issues.apache.org/jira/browse/ZOOKEEPER-965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13040095#comment-13040095
]
Marshall McMullen commented on ZOOKEEPER-965:
---------------------------------------------
I've just checked in another change to Ted's github branch that fixes a nasty
bug in the C API that we discovered while integrating the code into our code
base. I think this is a critical fix to the C API that should absolutely be
part of the final version we commit.
From my commit message:
ZOOKEEPER-965 - fix inoperable multi called from watch context.
While integrating the new zoo_multi and zoo_amulti into the
code at our company, we discovered a nasty bug in the C API. If
a multi op is triggered from a watch callback, then it would hang
indefinitely. This was due to a fundamental flaw in my earlier
implementation whereby I treated all the ops in a multi op as
asynchronous, with a forced synchronous tail completion that I
tacked onto the end of my completion list. However, that tail
completion was never invoked when the multi op was issued from a
watch context.
The fix was to add code to both process functions (sync and async)
for handling multi ops. Each op inside a multi op is handled the
same way, so both paths call into a common function. This way, the
existing logic for sync/async completions handles multi ops
directly, instead of me special-casing them.
I've also added a new unit test that reproduces this bug and now
passes with my refactoring.
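
One way to picture the hang described in the commit message is the toy
model below. It is not the C client code: it assumes that watch callbacks
and queued completions are dispatched by one and the same thread, and every
name in it (CompletionThread, multi_with_tail_completion,
multi_with_sync_path, WouldHang) is hypothetical. Where a real client would
block forever, the model raises an exception so that it can be run safely.

```python
from collections import deque


class WouldHang(Exception):
    """Raised by the model where a real client would block forever."""


class CompletionThread:
    """Toy stand-in for a single thread that runs watches and completions."""

    def __init__(self):
        self.queue = deque()
        self.busy = False  # True while a callback runs on this "thread"

    def post(self, callback):
        self.queue.append(callback)

    def drain(self):
        while self.queue:
            callback = self.queue.popleft()
            self.busy = True
            try:
                callback()
            finally:
                self.busy = False


def multi_with_tail_completion(thread, results):
    """Earlier design (as modeled): the result arrives via a queued tail."""
    done = []
    thread.post(lambda: done.append(results))
    if thread.busy:
        # The caller is the only thread able to run the tail, and it is
        # about to wait for that tail: nothing can make progress.
        raise WouldHang("tail completion is queued behind the caller")
    thread.drain()  # models another thread dispatching the tail
    return done[0]


def multi_with_sync_path(thread, results):
    """Refactored design (as modeled): the result bypasses the queue."""
    return results


thread = CompletionThread()
outcome = []


def watch_callback():
    try:
        multi_with_tail_completion(thread, ["ok"])
    except WouldHang:
        outcome.append("hang")
    outcome.append(multi_with_sync_path(thread, ["ok"]))


thread.post(watch_callback)
thread.drain()
```

In this model the first call fails only when it is issued from inside a
dispatched callback, which matches the symptom reported above; the second
call returns normally from the same context.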
> Need a multi-update command to allow multiple znodes to be updated safely
> -------------------------------------------------------------------------
>
> Key: ZOOKEEPER-965
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-965
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.3.3
> Reporter: Ted Dunning
> Assignee: Ted Dunning
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-965.patch, ZOOKEEPER-965.patch,
> ZOOKEEPER-965.patch, ZOOKEEPER-965.patch, ZOOKEEPER-965.patch,
> ZOOKEEPER-965.patch, ZOOKEEPER-965.patch, ZOOKEEPER-965.patch,
> ZOOKEEPER-965.patch, ZOOKEEPER-965.patch, ZOOKEEPER-965.patch,
> ZOOKEEPER-965.patch, ZOOKEEPER-965.patch, ZOOKEEPER-965.patch
>
>
> The basic idea is to have a single method called "multi" that will accept a
> list of create, delete, update or check objects each of which has a desired
> version or file state in the case of create. If all of the version and
> existence constraints can be satisfied, then all updates will be done
> atomically.
> Two API styles have been suggested. One has a list as above and the other
> style has a "Transaction" that allows builder-like methods to build a set of
> updates and a commit method to finalize the transaction. This can trivially
> be reduced to the first kind of API so the list based API style should be
> considered the primitive and the builder style should be implemented as
> syntactic sugar.
> The total size of all the data in all updates and creates in a single
> transaction should be limited to 1MB.
> Implementation-wise this capability can be done using standard ZK internals.
> The changes include:
> - update to ZK clients to allow the new call
> - additional wire level request
> - on the server, in the code that converts transactions to idempotent form,
> the code should be slightly extended to convert a list of operations to
> idempotent form.
> - on the client, a down-rev server that rejects the multi-update should be
> detected gracefully and an informative exception should be thrown.
> To facilitate shared development, I have established a github repository at
> https://github.com/tdunning/zookeeper and am happy to extend committer
> status to anyone who agrees to donate their code back to Apache. The final
> patch will be attached to this bug as normal.
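
The semantics proposed in the quoted description can be sketched with a
small in-memory model. This is an illustration only, not the ZooKeeper
implementation: Store, Op, Transaction and MultiFailed are hypothetical
names, versions start at 0, and -1 is assumed to mean "any version". The
sketch shows the list-based multi as the primitive, the builder style as a
thin wrapper over it, and the 1MB cap on the combined data.

```python
from dataclasses import dataclass

MAX_PAYLOAD = 1024 * 1024  # 1MB cap on all data in one transaction
KINDS = ("create", "delete", "update", "check")


class MultiFailed(Exception):
    """Raised when any constraint fails; no op is applied in that case."""


@dataclass
class Op:
    kind: str          # one of KINDS
    path: str
    data: bytes = b""
    version: int = -1  # expected version; -1 is assumed to mean "any"


class Store:
    def __init__(self):
        self.nodes = {}  # path -> (data, version)

    def multi(self, ops):
        """Apply every op, or none of them if a constraint is violated."""
        if sum(len(op.data) for op in ops) > MAX_PAYLOAD:
            raise MultiFailed("combined data exceeds 1MB")
        staged = dict(self.nodes)  # work on a copy, publish only on success
        for i, op in enumerate(ops):
            if op.kind not in KINDS:
                raise MultiFailed(f"op {i}: unknown kind {op.kind!r}")
            exists = op.path in staged
            if op.kind == "create":
                if exists:
                    raise MultiFailed(f"op {i}: {op.path} already exists")
                staged[op.path] = (op.data, 0)
                continue
            if not exists:
                raise MultiFailed(f"op {i}: {op.path} does not exist")
            _, version = staged[op.path]
            if op.version not in (-1, version):
                raise MultiFailed(f"op {i}: version mismatch on {op.path}")
            if op.kind == "delete":
                del staged[op.path]
            elif op.kind == "update":
                staged[op.path] = (op.data, version + 1)
            # "check" only validates and changes nothing
        self.nodes = staged


class Transaction:
    """Builder-style sugar that reduces to a single Store.multi call."""

    def __init__(self, store):
        self.store = store
        self.ops = []

    def create(self, path, data):
        self.ops.append(Op("create", path, data))
        return self

    def delete(self, path, version=-1):
        self.ops.append(Op("delete", path, version=version))
        return self

    def update(self, path, data, version=-1):
        self.ops.append(Op("update", path, data, version))
        return self

    def check(self, path, version):
        self.ops.append(Op("check", path, version=version))
        return self

    def commit(self):
        self.store.multi(self.ops)
```

For example, Transaction(store).create("/a", b"1").check("/a", 0).commit()
builds a two-element list and hands it to multi in one call. Staging the
changes on a copy is a simplification chosen to keep the all-or-nothing
behaviour easy to see; it says nothing about how the server does it.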