[ https://issues.apache.org/jira/browse/HBASE-10813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946396#comment-13946396 ]
Ding Yuan commented on HBASE-10813:
-----------------------------------

One comment for this code block:
{noformat}
try {
  if (zk && !ZKAssign.checkClosingState(server.getZooKeeper(),
      regionInfo, expectedVersion)) {
    // bad znode state
    return; // We're node deleting the znode, but it's not ours...
  }
  // TODO: If we need to keep updating CLOSING stamp to prevent against
  // a timeout if this is long-running, need to spin up a thread?
  if (region.close(abort) == null) {
    // This region got closed. Most likely due to a split. So instead
    // of doing the setClosedState() below, let's just ignore cont
    // The split message will clean up the master state.
    LOG.warn("Can't close region: was already closed during close(): " +
        regionInfo.getRegionNameAsString());
    return;
  }
} catch (Throwable t) {
  // A throwable here indicates that we couldn't successfully flush the
  // memstore before closing. So, we need to abort the server and allow
  // the master to split our logs in order to recover the data.
  server.abort("Unrecoverable exception while closing region " +
      regionInfo.getRegionNameAsString() + ", still finishing close", t);
  throw new RuntimeException(t);
}
{noformat}
checkClosingState could throw KeeperException, which will also abort the server. My patch currently follows the same logic, but is that the desired behavior? (Without system-specific understanding, I could not tell.) I'm asking because the comment in the catch block seems to suggest that the abort is only meant to defend against an exception in region.close().

> Possible over-catch of exceptions
> ---------------------------------
>
>                 Key: HBASE-10813
>                 URL: https://issues.apache.org/jira/browse/HBASE-10813
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver, util
>    Affects Versions: 0.96.1
>            Reporter: Ding Yuan
>            Assignee: Ding Yuan
>         Attachments: HBase-10813-trunk.patch
>
>
> There are a few cases, found by a tool, that are possible over-catches of
> exceptions, especially those that will abort the server.
> Over-catching these exceptions may unexpectedly abort the server, and may
> cause problems in the future when code in the try-block evolves. I am
> attaching a patch against trunk that constrains the catch blocks to the
> exact exceptions that were thrown.
> My tool actually found one more case in 0.96.1, but I found it has already
> been fixed in trunk:
> {noformat}
> Line: 1175, File: "org/apache/hadoop/hbase/master/SplitLogManager.java"
> 1173:   try {
> 1174:     Thread.sleep(20);
> 1175:-  } catch (Exception ignoreE) {
> 1175:+  } catch (InterruptedException e) {
> 1176:     // ignore
> 1177:   }
> {noformat}
> Any feedback is much appreciated!

--
This message was sent by Atlassian JIRA
(v6.2#6252)
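To illustrate the over-catch concern discussed above, here is a minimal, self-contained sketch. It is not HBase code: `ZkStateException`, `checkClosingState`, `closeRegionBroad`, and `closeRegionNarrow` are invented stand-ins for KeeperException and the region-close logic. The broad version lets a failure in the ZooKeeper state check fall into the same catch that was written to handle a failed memstore flush, while the narrowed version handles the two failure modes separately.

```java
public class CatchNarrowing {
    // Stand-in for KeeperException (hypothetical, for illustration only).
    static class ZkStateException extends Exception {
        ZkStateException(String m) { super(m); }
    }

    // Simulates ZKAssign.checkClosingState: fails when the znode state is bad.
    static boolean checkClosingState(boolean znodeOk) throws ZkStateException {
        if (!znodeOk) throw new ZkStateException("bad znode state");
        return true;
    }

    // Over-broad version: ANY Throwable, including a ZK check failure,
    // lands in the catch that was meant for close() failures.
    static String closeRegionBroad(boolean znodeOk) {
        try {
            checkClosingState(znodeOk);
            return "closed";
        } catch (Throwable t) {
            return "abort";  // server.abort(...) in the real code
        }
    }

    // Narrowed version: the ZK failure is handled on its own path,
    // so only genuine close failures would trigger an abort.
    static String closeRegionNarrow(boolean znodeOk) {
        try {
            checkClosingState(znodeOk);
            return "closed";
        } catch (ZkStateException e) {
            return "skip";   // bad znode state: return without aborting
        }
    }
}
```

Whether the narrowed behavior is correct for HBase is exactly the open question in the comment above; the sketch only shows the mechanical difference between the two catch scopes.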
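The SplitLogManager diff quoted in the issue description can also be sketched in isolation. This is a hypothetical helper (`sleepQuietly` is not an HBase method): it catches only `InterruptedException` from `Thread.sleep`, so any other exception in the try-block still propagates, and it restores the interrupt flag rather than silently swallowing it.

```java
public class SleepCatch {
    // Sleep for the given millis; return true if the sleep completed,
    // false if it was interrupted. Only InterruptedException is caught.
    static boolean sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // restore the interrupt flag
            return false;
        }
    }
}
```

Restoring the flag is a common refinement over the quoted `// ignore` body, though the quoted fix only narrows the exception type.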