[ https://issues.apache.org/jira/browse/HBASE-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017749#comment-13017749 ]
Gary Helmling commented on HBASE-3755:
--------------------------------------
+1
Patch looks good to me.
I get the feeling we need to clean up the error handling in a number of other
places too. Immediately below your changes in the ZooKeeperWatcher
constructor, there is:
{noformat}
      ZKUtil.createAndFailSilent(this, assignmentZNode);
      ZKUtil.createAndFailSilent(this, rsZNode);
      ZKUtil.createAndFailSilent(this, tableZNode);
    } catch (KeeperException e) {
      LOG.error(prefix("Unexpected KeeperException creating base node"), e);
      throw new IOException(e);
    }
{noformat}
So if the connection gets set up but then fails, we're still (in the
HConnectionManager case) wrapping the KeeperException in an IOException and
then wrapping that in a ZooKeeperConnectionException.
Seems like ultimately we should either let the KeeperException propagate and
handle it, or wrap it directly without the intervening IOException. I'm sure
there are other cases out there as well.
That's outside the scope of this change, though. Given the lower client
connection limit in the ZK distribution (which would show up in CDH packaging,
I believe, with ZK installed separately), a clear error message in this case
specifically is a big win.
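For reference, the augmentation proposed in the issue description could look
roughly like the sketch below. Again, this is not the attached patch: the
classes are simplified stand-ins, ZkAction, connect and HINT are hypothetical
names, and the exception message text is a placeholder. The hint wording is
taken from the issue description, minus the doc pointer.

```java
// Sketch only: simplified stand-ins, not the real HBase/ZooKeeper classes.
class AugmentSketch {
    /** Stand-in for org.apache.zookeeper.KeeperException (simplified). */
    public static class KeeperException extends Exception {
        public KeeperException(String msg) { super(msg); }
    }

    /** Stand-in for ZK's ConnectionLossException; the message is a placeholder. */
    public static class ConnectionLossException extends KeeperException {
        public ConnectionLossException() { super("ConnectionLoss"); }
    }

    /** Stand-in for ZooKeeperConnectionException; assumed to extend IOException. */
    public static class ZooKeeperConnectionException extends java.io.IOException {
        public ZooKeeperConnectionException(String msg, Throwable cause) {
            super(msg, cause);
        }
    }

    /** Hypothetical stand-in for whatever call first talks to ZooKeeper. */
    public interface ZkAction {
        void run() throws KeeperException;
    }

    public static final String HINT =
        "It's possible that the ZooKeeper server has too many connections"
        + " from this IP.";

    /** Adds the hint only for connection loss; other errors keep their message. */
    public static void connect(ZkAction action)
            throws ZooKeeperConnectionException {
        try {
            action.run();
        } catch (ConnectionLossException e) {
            throw new ZooKeeperConnectionException(e.getMessage() + ". " + HINT, e);
        } catch (KeeperException e) {
            throw new ZooKeeperConnectionException(e.getMessage(), e);
        }
    }
}
```

The more specific catch has to come first; otherwise the general
KeeperException handler would swallow the connection-loss case and the hint
would never be attached.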
> Catch zk's ConnectionLossException and augment error message with more help
> ---------------------------------------------------------------------------
>
> Key: HBASE-3755
> URL: https://issues.apache.org/jira/browse/HBASE-3755
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.1
> Reporter: Jean-Daniel Cryans
> Assignee: Jean-Daniel Cryans
> Fix For: 0.90.3
>
> Attachments: HBASE-3755.patch
>
>
> 0.90 behaves differently with regard to ZK connections: it tends to create
> too many of them, and it's not obvious to users what they should do to fix
> it. I think I've helped at least 5 different users this week with this error.
> By catching ConnectionLossException and augmenting its message, we could say
> something like "it's possible that the ZooKeeper server has too many
> connections from this IP, see doc at blah" since the ZK server isn't nice
> enough to let us know what's going on.