[ https://issues.apache.org/jira/browse/KAFKA-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942365#comment-13942365 ]
Jay Kreps commented on KAFKA-1155:
----------------------------------
[~nehanarkhede] Is this still open? I think this is by design in ZooKeeper: you
are supposed to depend only on the final state. Clearly a log of all changes
would be preferable... if only someone were building that :-)
> Kafka server can miss zookeeper watches during long zkclient callbacks
> ----------------------------------------------------------------------
>
> Key: KAFKA-1155
> URL: https://issues.apache.org/jira/browse/KAFKA-1155
> Project: Kafka
> Issue Type: Bug
> Components: controller
> Affects Versions: 0.8.0, 0.8.1
> Reporter: Neha Narkhede
> Assignee: Neha Narkhede
> Priority: Critical
>
> When a ZooKeeper watch fires, zkclient invokes the blocking user callback and
> re-registers the watch only after the callback returns. This leaves a
> potentially large window during which Kafka has no watch registered on the
> desired ZooKeeper paths, so it can miss important state changes (on the
> controller). Note that even though ZooKeeper has a read-and-set-watch API,
> there is always a window between the watch firing, the callback running, and
> the read-and-set-watch call. Because of the zkclient wrapper, this is
> difficult to handle properly in the Kafka code unless we use the ZooKeeper
> client directly. One workaround is to store timestamps on the paths and, when
> a watch fires, check whether the timestamp in ZooKeeper differs from the one
> held by the callback handler.
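
The timestamp check described above can be sketched roughly as follows. This
is only an illustration under stated assumptions, not the zkclient or
ZooKeeper API: FakeZNode, Handler, slow_work and run_demo are hypothetical
names, the node is a synchronous in-memory stand-in, and an integer version
counter plays the role of the timestamp stored on the path. The handler
remembers the version it acted on and, after re-registering the watch,
compares it with the current one; a mismatch means a write landed while no
watch was set, so it processes the latest state again.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple


@dataclass
class FakeZNode:
    """Hypothetical in-memory stand-in for one ZooKeeper path."""
    data: str = ""
    version: int = 0  # plays the role of the timestamp stored on the path
    watchers: Set[Callable[[], None]] = field(default_factory=set)

    def read(self) -> Tuple[str, int]:
        return self.data, self.version

    def read_and_watch(self, callback: Callable[[], None]) -> Tuple[str, int]:
        # Read the current state and register a one-shot watch together.
        self.watchers.add(callback)
        return self.data, self.version

    def write(self, data: str) -> None:
        self.data = data
        self.version += 1
        # One-shot semantics: registered watches are cleared when they fire.
        fired, self.watchers = self.watchers, set()
        for callback in fired:
            callback()


class Handler:
    """Re-checks the version after a long callback to catch missed writes."""

    def __init__(self, node: FakeZNode, slow_work: Callable[[], None]) -> None:
        self.node = node
        self.slow_work = slow_work
        self.processed: List[str] = []
        self.catch_ups = 0  # times a change slipped into the unwatched window
        node.read_and_watch(self.on_watch)

    def on_watch(self) -> None:
        data, seen = self.node.read()
        while True:
            self.processed.append(data)
            self.slow_work()  # long callback: no watch is registered here
            data, current = self.node.read_and_watch(self.on_watch)
            if current == seen:
                return  # nothing changed while the callback was running
            # The stored version moved on: handle the latest state as well.
            self.catch_ups += 1
            seen = current


def run_demo() -> Tuple[List[str], int, int]:
    node = FakeZNode()
    pending = ["b", "c"]  # writes that land while the first callback is busy

    def slow_work() -> None:
        while pending:
            node.write(pending.pop(0))

    handler = Handler(node, slow_work)
    node.write("a")
    node.write("d")
    return handler.processed, handler.catch_ups, len(node.watchers)
```

In this sketch the intermediate value "b" is never observed: the handler only
notices that the version moved and picks up the final state "c", which matches
the point that consumers should depend only on the final state. With real
timestamps the check can show that something changed, but not how many writes
were skipped.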