[
https://issues.apache.org/jira/browse/ZOOKEEPER-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15632820#comment-15632820
]
Benjamin Reed commented on ZOOKEEPER-2619:
------------------------------------------
I would love to see ZOOKEEPER-22 fixed, but I don't think it will be fixed
anytime soon. (It would be awesome to be surprised, though :)
@diego, perhaps you could implement your idea in your Go client implementation
and propose it again if it works out well? I like the getConnection proposal.
> Client library reconnecting breaks FIFO client order
> ----------------------------------------------------
>
> Key: ZOOKEEPER-2619
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2619
> Project: ZooKeeper
> Issue Type: Bug
> Reporter: Diego Ongaro
>
> According to the USENIX ATC 2010
> [paper|https://www.usenix.org/conference/usenix-atc-10/zookeeper-wait-free-coordination-internet-scale-systems],
> ZooKeeper provides "FIFO client order: all requests from a given client are
> executed in the order that they were sent by the client." I believe
> applications written using the Java client library are unable to rely on this
> guarantee, and any current application that does so is broken. Other client
> libraries are also likely to be affected.
> Consider this application, which is simplified from the algorithm described
> on Page 4 (right column) of the paper:
> {code}
> zk = new ZooKeeper(...)
> zk.createAsync("/data-23857", "...", callback)
> zk.createSync("/pointer", "/data-23857")
> {code}
> Assume an empty ZooKeeper database to begin with and no other writers.
> Applying the above definition, if the ZooKeeper database contains /pointer,
> it must also contain /data-23857.
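> For concreteness, here is roughly what those two calls look like against the
> actual Java client API; the connect string, session timeout, ACLs, and the
> empty callback below are placeholders for this sketch, not part of the
> algorithm itself:
> {code}
> import org.apache.zookeeper.CreateMode;
> import org.apache.zookeeper.ZooDefs.Ids;
> import org.apache.zookeeper.ZooKeeper;
>
> public class FifoExample {
>     public static void main(String[] args) throws Exception {
>         ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> {});
>
>         // Asynchronous create: queued on whatever TCP connection is current.
>         zk.create("/data-23857", "...".getBytes(), Ids.OPEN_ACL_UNSAFE,
>                 CreateMode.PERSISTENT,
>                 (rc, path, ctx, name) -> { /* rc reports ConnectionLoss, etc. */ },
>                 null);
>
>         // Synchronous create: blocks until the server responds or the call fails.
>         zk.create("/pointer", "/data-23857".getBytes(), Ids.OPEN_ACL_UNSAFE,
>                 CreateMode.PERSISTENT);
>     }
> }
> {code}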
> Now consider this series of unfortunate events:
> {code}
> zk = new ZooKeeper(...)
> // The library establishes a TCP connection.
> zk.createAsync("/data-23857", "...", callback)
> // The library/kernel closes the TCP connection because it times out, and
> // the create of /data-23857 is doomed to fail with ConnectionLoss. Suppose
> // that it never reaches the server.
> // The library establishes a new TCP connection.
> zk.createSync("/pointer", "/data-23857")
> // The create of /pointer succeeds.
> {code}
> That's the problem: subsequent operations get assigned to the new connection
> and succeed, while earlier operations fail.
> In general, I believe it's impossible to have a system with the following
> three properties:
> # FIFO client order for asynchronous operations,
> # Failing operations when connections are lost, AND
> # Transparently reconnecting when connections are lost.
> To argue this, consider an application that issues a series of pipelined
> operations, then upon noticing a connection loss, issues a series of recovery
> operations, repeating the recovery procedure as necessary. If a pipelined
> operation fails, all subsequent operations in the pipeline must also fail.
> Yet the client must also carry on eventually: the recovery operations cannot
> be trivially failed forever. Unfortunately, the client library does not know
> where the pipelined operations end and the recovery operations begin. At the
> time of a connection loss, subsequent pipelined operations may or may not be
> queued in the library; others might be upcoming in the application thread. If
> the library re-establishes a connection too early, it will send pipelined
> operations out of FIFO client order.
> I considered a possible workaround of having the application diligently check
> its callbacks and watchers for connection loss events, and do its best to stop
> issuing subsequent pipelined operations at the first sign of a connection loss.
> In addition to being a large burden for the application, this does not solve
> the problem all the time. In particular, if the callback thread is delayed
> significantly (as can happen due to excessive computation or scheduling
> hiccups), the application may not learn about the connection loss event until
> after the connection has been re-established and after dependent pipelined
> operations have already been transmitted over the new connection.
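> To make that concrete, here is a minimal sketch of the workaround and of why it
> is racy; the connect string, node names, and the AtomicBoolean flag are
> illustrative assumptions, not part of any existing API:
> {code}
> import java.util.concurrent.atomic.AtomicBoolean;
> import org.apache.zookeeper.CreateMode;
> import org.apache.zookeeper.KeeperException;
> import org.apache.zookeeper.ZooDefs.Ids;
> import org.apache.zookeeper.ZooKeeper;
>
> public class WorkaroundSketch {
>     // Flipped from the callback thread on the first ConnectionLoss result.
>     private static final AtomicBoolean lostConnection = new AtomicBoolean(false);
>
>     public static void main(String[] args) throws Exception {
>         ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> {});
>
>         zk.create("/data-23857", "...".getBytes(), Ids.OPEN_ACL_UNSAFE,
>                 CreateMode.PERSISTENT,
>                 (rc, path, ctx, name) -> {
>                     if (rc == KeeperException.Code.CONNECTIONLOSS.intValue()) {
>                         lostConnection.set(true);
>                     }
>                 }, null);
>
>         // Check the flag before issuing the dependent operation...
>         if (!lostConnection.get()) {
>             // ...but if the callback thread is delayed, the flag can still be
>             // false after the create above has already failed and the library
>             // has reconnected, so /pointer is still sent out of FIFO order.
>             zk.create("/pointer", "/data-23857".getBytes(), Ids.OPEN_ACL_UNSAFE,
>                     CreateMode.PERSISTENT);
>         }
>     }
> }
> {code}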
> I suggest the following API changes to fix the problem:
> - Add a method ZooKeeper.getConnection() returning a ZKConnection object.
> ZKConnection would wrap a TCP connection. It would include all synchronous
> and asynchronous operations currently defined on the ZooKeeper class. Upon a
> connection loss on a ZKConnection, all subsequent operations on the same
> ZKConnection would return a Connection Loss error. Upon noticing the error, the
> application would need to call ZooKeeper.getConnection() again to get a working
> ZKConnection object, and it would execute its recovery procedure on this new
> connection (see the sketch after this list).
> - Deprecate all asynchronous methods on the ZooKeeper object. These are
> unsafe to use if the caller assumes they're getting FIFO client order.
> - No changes to the protocols or servers are required.
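> To illustrate the intent of the getConnection() proposal, here is a sketch in
> the same simplified pseudocode as above; ZKConnection, ConnectionLossException,
> and runRecovery() are hypothetical names for this sketch, and nothing here
> exists in the library today:
> {code}
> ZKConnection conn = zk.getConnection();
> try {
>     conn.createAsync("/data-23857", "...", callback);  // bound to this connection
>     conn.createSync("/pointer", "/data-23857");        // same connection, FIFO preserved
> } catch (ConnectionLossException e) {
>     // Every later operation on this ZKConnection also fails with ConnectionLoss
>     // (asynchronous ones report it to their callbacks), so the application knows
>     // exactly where its pipeline broke.
>     ZKConnection fresh = zk.getConnection();  // hypothetical: returns a working connection
>     runRecovery(fresh);                       // application-defined recovery procedure
> }
> {code}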
> I recognize this could cause a lot of code churn for both ZooKeeper and
> projects that use it. On the other hand, the existing asynchronous calls in
> applications should now be audited anyhow.
> The code affected by this issue may be difficult to contain:
> - It likely affects all ZooKeeper client libraries that provide both
> asynchronous operations and transparent reconnection. That's probably all
> versions of the official Java client library, as well as most other client
> libraries.
> - It affects all applications using those libraries that depend on the FIFO
> client order of asynchronous operations. I don't know how common that is, but
> the paper implies that FIFO client order is important.
> - Fortunately, the issue can only manifest itself when connections are lost
> and transparently reestablished. In practice, it may also require a long
> pipeline or a significant delay in the application thread while the library
> establishes a new connection.
> - In case you're wondering, this issue occurred to me while working on a new
> client library for Go. I haven't seen this issue in the wild, but I was able
> to produce it locally by placing sleep statements in a Java program and
> closing its TCP connections.
> I'm new to this community, so I'm looking forward to the discussion. Let me
> know if I can clarify any of the above.