[ https://issues.apache.org/jira/browse/HBASE-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102973#comment-13102973 ]
stack commented on HBASE-4357:
------------------------------
@Ming In TRUNK, we have RecoverableZooKeeper. It does the following when it's
trying to get the version:
{code}
  /**
   * exists is an idempotent operation. Retry before throw out exception
   * @param path
   * @param watcher
   * @return
   * @throws KeeperException
   * @throws InterruptedException
   */
  public Stat exists(String path, Watcher watcher)
  throws KeeperException, InterruptedException {
    RetryCounter retryCounter = retryCounterFactory.create();
    while (true) {
      try {
        return zk.exists(path, watcher);
      } catch (KeeperException e) {
        switch (e.code()) {
          case CONNECTIONLOSS:
          case OPERATIONTIMEOUT:
            LOG.warn("Possibly transient ZooKeeper exception: " + e);
            if (!retryCounter.shouldRetry()) {
              LOG.error("ZooKeeper exists failed after "
                + retryCounter.getMaxRetries() + " retries");
              throw e;
            }
            break;
          default:
            throw e;
        }
      }
      LOG.info("The "+retryCounter.getAttemptTimes()+" times to retry " +
        "ZooKeeper after sleeping "+retryIntervalMillis+" ms");
      retryCounter.sleepUntilNextRetry();
      retryCounter.useRetry();
    }
  }
{code}
That is, it retries.
We should probably do your #2 above too.
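
For anyone who wants to try the pattern outside HBase, here is a rough standalone sketch of the same bounded-retry loop. It is not the RecoverableZooKeeper code: RetrySketch, TransientException and callWithRetries are made-up names, TransientException stands in for the CONNECTIONLOSS / OPERATIONTIMEOUT cases, and a plain counter stands in for RetryCounter. As in the snippet above, it only makes sense for idempotent operations.

```java
import java.util.concurrent.Callable;

public class RetrySketch {

    /** Hypothetical stand-in for a transient failure such as CONNECTIONLOSS. */
    public static class TransientException extends Exception {
        public TransientException(String msg) {
            super(msg);
        }
    }

    /**
     * Runs op, retrying up to maxRetries times when it fails with a
     * TransientException. Any other exception is rethrown immediately.
     */
    public static <T> T callWithRetries(Callable<T> op, int maxRetries, long sleepMillis)
            throws Exception {
        int retriesLeft = maxRetries;
        while (true) {
            try {
                return op.call();
            } catch (TransientException e) {
                if (retriesLeft <= 0) {
                    // Retry budget used up: surface the last failure to the caller.
                    throw e;
                }
                retriesLeft--;
            }
            // Fixed pause between attempts; a real client may prefer backoff.
            Thread.sleep(sleepMillis);
        }
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulated operation that fails twice, then succeeds.
        String result = callWithRetries(() -> {
            if (++calls[0] < 3) {
                throw new TransientException("connection loss");
            }
            return "ok";
        }, 3, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Running main prints "ok after 3 attempts": the first two calls fail with the simulated transient error and the third succeeds.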
> Region in transition - in closing state
> ---------------------------------------
>
> Key: HBASE-4357
> URL: https://issues.apache.org/jira/browse/HBASE-4357
> Project: HBase
> Issue Type: Bug
> Reporter: Ming Ma
>
> Got the following during testing,
> 1. On a given machine, kill "RS process id". Then kill "HMaster process id".
> 2. Start RS first via "bin/hbase-daemon.sh --config ./conf start
> regionserver.". Start HMaster via "bin/hbase-daemon.sh --config ./conf start
> master".
> One region of a table stayed in closing state.
> According to zookeeper,
> 794a6ff17a4de0dd0a19b984ba18eea9
> miweng_500region,H\xB49X\x10bM\xB1,1315338786464.794a6ff17a4de0dd0a19b984ba18eea9.
> state=CLOSING, ts=Wed Sep 07 17:21:44 PDT 2011 (75701s ago),
> server=sea-esxi-0,60000,1315428682281
> According to the .META. table, the region has been reassigned from sea-esxi-0 to
> sea-esxi-4.
> miweng_500region,H\xB49X\x10bM\xB1,1315338786464.794a6ff17a4de0dd0a19b984ba18eea9.
> sea-esxi-4:60030 H\xB49X\x10bM\xB1 I7K\xC6\xA7\xEF\x9D\x90 0