[ https://issues.apache.org/jira/browse/HBASE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13076032#comment-13076032 ]
Ted Yu commented on HBASE-3065:
-------------------------------
In RecoverableZooKeeper, should we make the handling of zero-length data in
appendMetaData() and removeMetaData() symmetrical?
I mean this change:
{code}
   private byte[] appendMetaData(byte[] data) {
-    if(data == null){
+    if(data == null || data.length == 0){
       return null;
     }
{code}
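To make the symmetry concrete, here is a minimal sketch of the two helpers with matching guards. It is not the RecoverableZooKeeper code: the class name MetaDataSketch, the single magic byte used as the header, and the exact behavior of removeMetaData() on null or empty input are assumptions made for illustration only.

```java
import java.util.Arrays;

// Hypothetical, simplified stand-in for the two metadata helpers.
// Assumption: the "metadata" is a single magic byte prepended to the payload.
class MetaDataSketch {
    static final byte MAGIC = (byte) 0xff;

    // Proposed guard: treat null and zero-length input the same way.
    static byte[] appendMetaData(byte[] data) {
        if (data == null || data.length == 0) {
            return null;
        }
        byte[] out = new byte[data.length + 1];
        out[0] = MAGIC;
        System.arraycopy(data, 0, out, 1, data.length);
        return out;
    }

    // Assumed counterpart: the same null / zero-length guard before
    // looking at the header byte.
    static byte[] removeMetaData(byte[] data) {
        if (data == null || data.length == 0) {
            return data;
        }
        if (data[0] != MAGIC) {
            return data; // no header present, hand the bytes back untouched
        }
        return Arrays.copyOfRange(data, 1, data.length);
    }
}
```

One consequence of returning null in the guard, at least in this sketch, is that an empty array does not round-trip: it comes back as null rather than as an empty array.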
> Retry all 'retryable' zk operations; e.g. connection loss
> ---------------------------------------------------------
>
> Key: HBASE-3065
> URL: https://issues.apache.org/jira/browse/HBASE-3065
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: Liyin Tang
> Priority: Blocker
> Fix For: 0.92.0
>
> Attachments: 3065-v3.txt, 3065-v4.txt, HBASE-3065-addendum.patch,
> HBase-3065[r1088475]_1.patch, hbase3065_2.patch
>
>
> The 'new' master refactored our zk code, tidying up all zk accesses and
> corralling them behind nice zk utility classes. One improvement was letting
> out all KeeperExceptions so the client can deal with them. That's good
> generally because in the old days we'd suppress important zk state changes.
> But there is at least one case the new zk utility could handle for the
> application, and that's the class of retryable KeeperExceptions. The one that
> comes to mind is connection loss. On connection loss we should retry the
> just-failed operation. Usually the retry will just work. At worst, on
> reconnect, we'll pick up the expired session event.
> Adding in this change shouldn't be too bad given that the refactor corralled
> all zk access into one or two classes only.
> One thing to consider though is how much we should retry. We could retry on
> a timer, or we could retry forever as long as the Stoppable interface is
> passed, so that if another thread has stopped or aborted the hosting service,
> we'll notice and give up trying. Doing the latter is probably better than some
> kind of timeout.
> HBASE-3062 adds a timed retry on the first zk operation. This issue is about
> generalizing what is done there across all zk access.
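The "retry until stopped" option described in the issue could look roughly like the sketch below. It is only an illustration under stated assumptions, not the attached patch: RetrySketch, retryUntilStopped, and the sleepMillis parameter are hypothetical names, the nested Stoppable interface is a one-method stand-in for the HBase interface, and ConnectionLossException is a local stand-in for the ZooKeeper connection-loss KeeperException so that the example compiles without the ZooKeeper jar.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch of retrying a zk operation on connection loss
// until the hosting service is stopped.
class RetrySketch {

    // Stand-in for the HBase Stoppable interface (only the check we need).
    interface Stoppable {
        boolean isStopped();
    }

    // Stand-in for the retryable connection-loss KeeperException.
    static class ConnectionLossException extends Exception {
        private static final long serialVersionUID = 1L;
    }

    // Runs op, retrying on connection loss. Gives up and rethrows once the
    // stopper reports that the service was stopped or aborted elsewhere.
    static <T> T retryUntilStopped(Callable<T> op, Stoppable stopper, long sleepMillis)
            throws Exception {
        while (true) {
            try {
                return op.call();
            } catch (ConnectionLossException e) {
                if (stopper.isStopped()) {
                    throw e; // another thread stopped the service: stop retrying
                }
                Thread.sleep(sleepMillis); // brief pause before the next attempt
            }
            // Any other exception propagates to the caller unchanged.
        }
    }
}
```

Only the connection-loss case is retried here; every other exception is still let out for the client to deal with, which matches the intent stated in the description.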