Thanks J-D. I'll keep an eye on the Jira.
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Jean-Daniel Cryans
> Sent: Monday, April 11, 2011 11:52
> To: [email protected]
> Subject: Re: Catching ZK ConnectionLoss with HTable
>
> I'm cleaning this up in this jira
> https://issues.apache.org/jira/browse/HBASE-3755
>
> But it's a failure case I haven't seen before, really interesting.
> There's a HTable that's created in the guts of HCM that will throw a
> ZookeeperConnectionException but it will bubble up as an IOE. I'll try to
> address this too in 3755.
>
> J-D
>
> On Mon, Apr 11, 2011 at 11:03 AM, Sandy Pratt <[email protected]> wrote:
> > Hi all,
> >
> > I had an issue recently where a scan job I frequently run caught
> > ConnectionLoss and subsequently failed to recover.
> >
> > The stack trace looks like this:
> >
> > 11/04/08 12:20:04 INFO zookeeper.ZooKeeper: Session: 0x12f2497b00d03d8 closed
> > 11/04/08 12:20:04 WARN client.HConnectionManager$ClientZKWatcher: No longer connected to ZooKeeper, current state: Disconnected
> > 11/04/08 12:20:05 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:21811
> > 11/04/08 12:20:05 INFO zookeeper.ZooKeeper: Session: 0x12f2497b00d03d9 closed
> > 11/04/08 12:20:06 INFO zookeeper.ZooKeeperWrapper: Reconnecting to zookeeper
> > 11/04/08 12:20:06 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:21811 sessionTimeout=60000 watcher=org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper@51127a
> > 11/04/08 12:20:06 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:21811
> > 11/04/08 12:20:06 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
> > java.net.ConnectException: Connection refused
> >         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> >         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> >         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078)
> > 11/04/08 12:20:06 WARN zookeeper.ZooKeeperWrapper: Problem getting stats for /hbase/rs
> > org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/rs
> >         at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
> >         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
> >         at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:809)
> >         at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:837)
> >         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.getRSDirectoryCount(ZooKeeperWrapper.java:754)
> >         at org.apache.hadoop.hbase.client.HTable.getCurrentNrHRS(HTable.java:173)
> >         at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:147)
> >         at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:102)
> >         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.prefetchRegionCache(HConnectionManager.java:732)
> >         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:783)
> >         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:677)
> >         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:650)
> >         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionLocation(HConnectionManager.java:470)
> >         at org.apache.hadoop.hbase.client.ServerCallable.instantiateServer(ServerCallable.java:57)
> >         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1145)
> >         at org.apache.hadoop.hbase.client.HTable.get(HTable.java:503)
> >         at com.adobe.hs.ets.dozer.afs.EtsAfsBuilder.getHBaseTimestamp(EtsAfsBuilder.java:215)
> >         at com.adobe.hs.ets.dozer.afs.EtsAfsBuilder.syncHour(EtsAfsBuilder.java:310)
> >         at com.adobe.hs.ets.dozer.afs.EtsAfsBuilder.go(EtsAfsBuilder.java:130)
> >         at BuildAfs.main(BuildAfs.java:43)
> > 11/04/08 12:20:07 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:21811
> > 11/04/08 12:20:07 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
> > java.net.ConnectException: Connection refused
> >         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> >         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> >         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078)
> > 11/04/08 12:20:09 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:21811
> > 11/04/08 12:20:09 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
> >
> > It then goes on to retry endlessly. Killing the spinning job and running it
> > again worked fine, so crashing would be preferable to me over retrying
> > endlessly.
> >
> > I'm not especially concerned about what went wrong to cause
> > ConnectionLoss in the first place, but I am interested in being able to set
> > some behavior for handling the ZK exceptions elegantly. For example, the
> > call site in my code leading to the exception is this:
> >
> > Get get = new Get(Bytes.toBytes(level.rowKeyDateFormat.format(dts)));
> > Result result = timestampsTable.get(get);
> >
> > I suppose this means that if I want to catch ConnectionLoss in my code, I
> > have to wrap all my gets and puts with that catch block. Or maybe just the
> > first one? It seems like HTable and friends might be able to catch this
> > exception in a more central location, maybe somewhere in here:
> >
> >         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.getRSDirectoryCount(ZooKeeperWrapper.java:754)
> >
> > I'm running HBase 0.89.20100924+28. Will this issue go away if I upgrade to
> > a newer version?
> >
> > Thanks,
> > Sandy
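
On the question in the quoted message about wrapping every get and put: one way to avoid repeating the catch block at each call site is a small helper that bounds the number of attempts and then fails, so the job crashes instead of spinning. The following is only a rough sketch. `BoundedRetry`, `maxAttempts`, and `pauseMillis` are hypothetical names, not part of HBase or ZooKeeper. It relies on the point above that the ZooKeeper failure surfaces as an IOException, and it only helps if the client call actually throws; if the call blocks inside the client, a timeout around the call would be needed instead.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper (not an HBase API): runs an operation a bounded number of times. */
public final class BoundedRetry {
    private BoundedRetry() {}

    /**
     * Calls op up to maxAttempts times, pausing pauseMillis between attempts.
     * Retries only on IOException; rethrows the last failure wrapped in a new
     * IOException once the attempts are used up.
     */
    public static <T> T call(Callable<T> op, int maxAttempts, long pauseMillis)
            throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (IOException e) {
                last = e; // candidate for retry
            } catch (RuntimeException e) {
                throw e; // programming errors are not retried
            } catch (Exception e) {
                throw new IOException("Unexpected failure", e);
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IOException("Interrupted while waiting to retry", ie);
                }
            }
        }
        throw new IOException("Giving up after " + maxAttempts + " attempts", last);
    }
}
```

With a helper like this, the call site from the quoted message could become something along the lines of `Result result = BoundedRetry.call(() -> timestampsTable.get(get), 3, 1000L);` (the lambda needs Java 8 or later; an anonymous `Callable` works on older JVMs). The attempt count and pause are placeholders to tune.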
