[ 
https://issues.apache.org/jira/browse/HELIX-748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16558786#comment-16558786
 ] 

Jiajun Wang commented on HELIX-748:
-----------------------------------

Change to something like this:

public <T> T retryUntilConnected(final Callable<T> callable)
    throws IllegalArgumentException, ZkException {
  if (_zookeeperEventThread != null && Thread.currentThread() == _zookeeperEventThread) {
    throw new IllegalArgumentException("Must not be done in the zookeeper event thread.");
  }
  final long operationStartTime = System.currentTimeMillis();
  while (true) {
    if (_closed) {
      throw new IllegalStateException("ZkClient already closed!");
    }
    try {
      final ZkConnection zkConnection = (ZkConnection) getConnection();
      // Validate that the connection is not null before trigger callback
      if (zkConnection == null || zkConnection.getZookeeper() == null) {
        LOG.debug(
            "ZkConnection is in invalid state! Retry until timeout or ZkClient closed.");
      } else {
        return callable.call();
      }
    } catch (InterruptedException e) {
      throw new ZkInterruptedException(e);
    } catch (Exception e) {
      // we give the ZkClient some time to fix the connection issue.
      Thread.yield();
      waitForRetry();
    }
    // before attempting a retry, check whether retry timeout has elapsed
    if (System.currentTimeMillis() - operationStartTime > _operationRetryTimeoutInMillis) {
      throw new ZkTimeoutException(
          "Operation cannot be retried because of retry timeout ("
              + _operationRetryTimeoutInMillis + " milli seconds)");
    }
  }
}

 

Corner cases still need to be validated, and test cases need to be added.
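
As a starting point for such test cases, here is a minimal, self-contained sketch of the intended behavior: an operation waits while the connection is null (as it would be during a reset) and gives up only after the retry timeout. Everything in it is hypothetical and not part of Helix: the class RetrySketch, the exception RetryTimeoutException, and the timeoutMillis and pauseMillis parameters are illustrative stand-ins. Two differences from the snippet above are deliberate assumptions: the sketch sleeps briefly in the null-connection case (the snippet above loops again without waiting in that branch, which may be worth checking), and it lets exceptions from the operation propagate instead of retrying on them.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical stand-in for the retry loop; none of these names are Helix APIs. */
final class RetrySketch {

    /** Thrown when the retry timeout elapses before a connection appears. */
    static final class RetryTimeoutException extends RuntimeException {
        RetryTimeoutException(String message) {
            super(message);
        }
    }

    /**
     * Runs the operation as soon as the supplier yields a non-null connection.
     * While the connection is null, polls every pauseMillis until timeoutMillis
     * has elapsed, then throws RetryTimeoutException.
     */
    static <C, T> T retryUntilConnected(Supplier<C> connection, Callable<T> operation,
                                        long timeoutMillis, long pauseMillis) throws Exception {
        final long start = System.currentTimeMillis();
        while (true) {
            if (connection.get() != null) {
                // Connection is usable: run the operation and return its result.
                return operation.call();
            }
            // Connection is still being reset: check the retry budget first.
            if (System.currentTimeMillis() - start > timeoutMillis) {
                throw new RetryTimeoutException(
                    "Gave up after " + timeoutMillis + " ms without a connection");
            }
            // Assumption: pause briefly so the loop does not spin on a null connection.
            Thread.sleep(pauseMillis);
        }
    }

    public static void main(String[] args) throws Exception {
        // The reference starts out null, simulating a connection that is being reset.
        AtomicReference<String> conn = new AtomicReference<>();

        // Simulate the reset completing after roughly 50 ms.
        Thread resetter = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            conn.set("connected");
        });
        resetter.start();

        String result = retryUntilConnected(conn::get, () -> "ok:" + conn.get(), 2000, 10);
        System.out.println(result); // prints "ok:connected"
        resetter.join();
    }
}
```

A real test would presumably swap the AtomicReference for a ZkClient whose connection is reset mid-operation, and would cover both outcomes: success once the reset completes, and a timeout when it never does.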

> ZkClient should not throw Exception when internal ZkConnection is reset
> -----------------------------------------------------------------------
>
>                 Key: HELIX-748
>                 URL: https://issues.apache.org/jira/browse/HELIX-748
>             Project: Apache Helix
>          Issue Type: Task
>            Reporter: Jiajun Wang
>            Assignee: Jiajun Wang
>            Priority: Major
>
> ZkClient has been observed to throw an exception because ZkConnection == 
> null while the connection is being reset.
> This could be caused by expired-session handling. According to the 
> design, a ZkClient operation should wait until the reset is done instead of 
> breaking out of the retry loop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
