[
https://issues.apache.org/jira/browse/HBASE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036529#comment-13036529
]
Liyin Tang commented on HBASE-3065:
-----------------------------------
Thanks stack. If you cannot patch it, I can submit a new patch later. Since
89 uses zk in a different way, I need more time to debug the unit test
failures :)
Thanks, Liyin
> Retry all 'retryable' zk operations; e.g. connection loss
> ---------------------------------------------------------
>
> Key: HBASE-3065
> URL: https://issues.apache.org/jira/browse/HBASE-3065
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: Liyin Tang
> Fix For: 0.92.0
>
> Attachments: 3065-v3.txt, HBase-3065[r1088475]_1.patch,
> hbase3065_2.patch
>
>
> The 'new' master refactored our zk code, tidying up all zk accesses and
> corralling them behind nice zk utility classes. One improvement was letting
> all KeeperExceptions out so the client deals with them. That's good generally
> because in the old days we'd suppress important zk state changes. But
> there is at least one case the new zk utility could handle for the
> application, and that's the class of retryable KeeperExceptions. The one that
> comes to mind is connection loss. On connection loss we should retry the
> just-failed operation. Usually the retry will just work. At worst, on
> reconnect, we'll pick up the expired session event.
> Adding in this change shouldn't be too bad given the refactor of zk corralled
> all zk access into one or two classes only.
> One thing to consider though is how much we should retry. We could retry on
> a timer, or we could retry forever as long as a Stoppable is passed in, so
> that if another thread has stopped or aborted the hosting service, we'll
> notice and give up trying. Doing the latter is probably better than some
> kind of timeout.
> HBASE-3062 adds a timed retry on the first zk operation. This issue is about
> generalizing what is over there across all zk access.
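
The "retry until stopped" option described above could be sketched roughly as
follows. This is only an illustration, not the attached patch: the class and
method names are hypothetical, the Stoppable interface here is a local stand-in
that mirrors only isStopped(), and ConnectionLossException is a local,
unchecked stand-in for a retryable KeeperException (the real KeeperException is
a checked exception).

```java
import java.util.function.Supplier;

// Hypothetical sketch: retry a zk-style operation on connection loss until it
// succeeds or the hosting service has been stopped.
final class ZkRetrySketch {

    // Local stand-in for a Stoppable; only the method needed here.
    interface Stoppable {
        boolean isStopped();
    }

    // Local stand-in for a retryable KeeperException such as connection loss.
    // Unchecked here for brevity; the real exception is checked.
    static final class ConnectionLossException extends RuntimeException {
        ConnectionLossException(String msg) {
            super(msg);
        }
    }

    // Runs op, retrying after each connection loss. Gives up and rethrows the
    // last failure once stopper reports that the service has stopped.
    static <T> T retryUntilStopped(Supplier<T> op, Stoppable stopper, long pauseMs) {
        while (true) {
            try {
                return op.get();
            } catch (ConnectionLossException e) {
                if (stopper.isStopped()) {
                    throw e; // another thread stopped or aborted the service
                }
                try {
                    Thread.sleep(pauseMs); // brief pause before the next attempt
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e; // treat interruption as a request to give up
                }
            }
        }
    }
}
```

Only the retryable exception is caught, so any other failure (for example an
expired session) would still propagate to the caller, as in the current code.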