[
https://issues.apache.org/jira/browse/HBASE-10813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946396#comment-13946396
]
Ding Yuan commented on HBASE-10813:
-----------------------------------
One comment for this code block:
{noformat}
    try {
      if (zk && !ZKAssign.checkClosingState(server.getZooKeeper(),
          regionInfo, expectedVersion)) {
        // bad znode state
        return; // We're not deleting the znode, but it's not ours...
      }
      // TODO: If we need to keep updating CLOSING stamp to prevent against
      // a timeout if this is long-running, need to spin up a thread?
      if (region.close(abort) == null) {
        // This region got closed. Most likely due to a split. So instead
        // of doing the setClosedState() below, let's just ignore it.
        // The split message will clean up the master state.
        LOG.warn("Can't close region: was already closed during close(): " +
          regionInfo.getRegionNameAsString());
        return;
      }
    } catch (Throwable t) {
      // A throwable here indicates that we couldn't successfully flush the
      // memstore before closing. So, we need to abort the server and allow
      // the master to split our logs in order to recover the data.
      server.abort("Unrecoverable exception while closing region " +
        regionInfo.getRegionNameAsString() + ", still finishing close", t);
      throw new RuntimeException(t);
    }
{noformat}
checkClosingState can throw KeeperException, which will also abort the server
here. My patch currently follows the same logic, but is that the desired
behavior (without system-specific understanding I can't tell)? I'm asking
because the comment in the catch block seems to suggest that the abort is only
meant to defend against an exception from region.close().
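To make the question concrete, here is a rough sketch (not the attached patch,
and reusing the same fields and locals as the block above) of how the two
concerns could be separated; whether the KeeperException branch should still
abort the server is exactly what I'm unsure about:
{noformat}
// Sketch only: separate the ZooKeeper state check from the region close,
// so each failure mode has its own, explicit handler.
try {
  if (zk && !ZKAssign.checkClosingState(server.getZooKeeper(),
      regionInfo, expectedVersion)) {
    return; // bad znode state; the znode is not ours to delete
  }
} catch (KeeperException e) {
  // Open question: should a ZK failure here abort the server too,
  // or could we bail out / retry instead?
  server.abort("Failed ZK check while closing region " +
    regionInfo.getRegionNameAsString(), e);
  throw new RuntimeException(e);
}

try {
  if (region.close(abort) == null) {
    // Region already closed (most likely due to a split); the split
    // message will clean up the master state.
    LOG.warn("Can't close region: was already closed during close(): " +
      regionInfo.getRegionNameAsString());
    return;
  }
} catch (Throwable t) {
  // This is the case the original comment describes: a failed memstore
  // flush during close, so abort and let the master split our logs.
  server.abort("Unrecoverable exception while closing region " +
    regionInfo.getRegionNameAsString() + ", still finishing close", t);
  throw new RuntimeException(t);
}
{noformat}
If aborting on KeeperException is indeed intended, the patch can simply keep
the current behavior; otherwise the first catch block could be relaxed.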
> Possible over-catch of exceptions
> ---------------------------------
>
> Key: HBASE-10813
> URL: https://issues.apache.org/jira/browse/HBASE-10813
> Project: HBase
> Issue Type: Improvement
> Components: regionserver, util
> Affects Versions: 0.96.1
> Reporter: Ding Yuan
> Assignee: Ding Yuan
> Attachments: HBase-10813-trunk.patch
>
>
> A tool found a few cases of possible over-catching of exceptions, especially
> ones that can abort the server. Over-catching these exceptions may abort the
> server unexpectedly, and may cause problems in the future as the code in the
> try block evolves. I am attaching a patch against trunk that constrains the
> catch blocks to the exact exceptions that are thrown.
> My tool actually found one more case in 0.96.1, but it has already been
> fixed in trunk:
> {noformat}
> Line: 1175, File: "org/apache/hadoop/hbase/master/SplitLogManager.java"
> 1173: try {
> 1174: Thread.sleep(20);
> 1175:- } catch (Exception ignoreE) {
> 1175:+ } catch (InterruptedException e) {
> 1176: // ignore
> 1177: }
> {noformat}
> Any feedback is much appreciated!
--
This message was sent by Atlassian JIRA
(v6.2#6252)