So I'm guessing that the log you pasted was from the master. I can see the ZooKeeper client doing retries and, strangely enough, being kicked out by the other ZK peers:

2011-04-21 14:48:26,043 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to 162-2-77-0/162.2.77.0:2181, initiating session
2011-04-21 14:48:26,044 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x12f76bfa98a0007, likely server has closed socket, closing socket connection and attempting reconnect

What's going on in the ZK logs during that time? It seems to me that something strange is going on.

Also, if the line you pointed out did throw an exception, it should have been sent back to the client... what was it? SessionExpired?

To me it looks like HBase did the right thing: it refused to create a table because it wasn't able to write to ZooKeeper. But the reason why it wasn't able to isn't clear to me.

J-D

On Sat, Apr 23, 2011 at 1:35 AM, Gaojinchao <[email protected]> wrote:
> In this case:
> 1. Creating a table with regions.
> 2. ZK crashed (my cluster has 3 ZK processes; this happened when the leader crashed).
> 3. Creating the table failed.
>
> I found this code:
>
> Please confirm whether it needs to be fixed.
>
> for (HRegionInfo newRegion : newRegions) {
>
>   // 1. Set table enabling flag up in zk.
>   try {
>     assignmentManager.getZKTable().setEnabledTable(tableName);
>     // This call throws the exception.
>   } catch (KeeperException e) {
>     throw new IOException("Unable to ensure that the table will be" +
>       " enabled because of a ZooKeeper issue", e);
>   }
>
>   // 2. Create HRegion
>   HRegion region = HRegion.createHRegion(newRegion,
>     fileSystemManager.getRootDir(), conf);
>
>   // 3. Insert into META
>   MetaEditor.addRegionToMeta(catalogTracker, region.getRegionInfo());
>
>   // 4. Close the new region to flush to disk. Close log file too.
>   region.close();
>   region.getLog().closeAndDelete();
> }
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Jean-Daniel Cryans
> Sent: April 23, 2011 3:10
> To: [email protected]
> Subject: Re: Creating table with regions failed when zk crashed.
>
> What exactly happened here?
> As much as I enjoy reading logs, I also enjoy short descriptions of the context of what I'm looking at.
>
> J-D
>
> On Thu, Apr 21, 2011 at 8:36 PM, Gaojinchao <[email protected]> wrote:
>> Is there any issue about this?
>>
>>
>> 2011-04-21 14:48:24,676 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed wfan_1,3238613814230765,1303367938566.63a83bdc9b55ec115cbc9b4bbe318214.
>> 2011-04-21 14:48:24,676 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: IPC Server handler 4 on 20000.logSyncer interrupted while waiting for sync requests
>> 2011-04-21 14:48:24,676 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: IPC Server handler 4 on 20000.logSyncer exiting
>> 2011-04-21 14:48:24,677 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: closing hlog writer in hdfs://162.2.6.187:9000/hbase/wfan_1/63a83bdc9b55ec115cbc9b4bbe318214/.logs
>> 2011-04-21 14:48:24,719 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Moved 1 log files to /hbase/wfan_1/63a83bdc9b55ec115cbc9b4bbe318214/.oldlogs
>> 2011-04-21 14:48:24,728 WARN org.apache.hadoop.hbase.zookeeper.ZKTable: Moving table wfan_1 state to enabled but was already enabled
>> 2011-04-21 14:48:24,912 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server /162.2.77.0:2181
>> 2011-04-21 14:48:24,912 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to 162-2-77-0/162.2.77.0:2181, initiating session
>> 2011-04-21 14:48:24,914 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x12f76bfa98a0007, likely server has closed socket, closing socket connection and attempting reconnect
>> 2011-04-21 14:48:25,308 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server /162.2.77.0:2181
>> 2011-04-21 14:48:25,308 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to 162-2-77-0/162.2.77.0:2181, initiating session
>> 2011-04-21 14:48:25,309 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x12f76bfa98a0004, likely server has closed socket, closing socket connection and attempting reconnect
>> 2011-04-21 14:48:25,410 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call createTable({NAME => 'wfan_1', FAMILIES => [{NAME => 'value', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'GZ', TTL => '5184000', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}, [[B@2637df06) from 162.2.134.26:3705: output error
>> 2011-04-21 14:48:25,410 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 4 on 20000 caught: java.nio.channels.ClosedChannelException
>>     at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
>>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1339)
>>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:727)
>>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:792)
>>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1083)
>>
>> 2011-04-21 14:48:25,416 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server /162.2.6.187:2181
>> 2011-04-21 14:48:25,417 WARN org.apache.zookeeper.ClientCnxn: Session 0x12f76bfa98a0007 for server null, unexpected error, closing socket connection and attempting reconnect
>> java.net.ConnectException: Connection refused
>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1130)
>> 2011-04-21 14:48:25,678 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server 162-2-16-6/162.2.16.6:2181
>> 2011-04-21 14:48:25,678 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to 162-2-16-6/162.2.16.6:2181, initiating session
>> 2011-04-21 14:48:25,679 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x12f76bfa98a0007, likely server has closed socket, closing socket connection and attempting reconnect
>> 2011-04-21 14:48:26,043 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server 162-2-77-0/162.2.77.0:2181
>> 2011-04-21 14:48:26,043 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to 162-2-77-0/162.2.77.0:2181, initiating session
>> 2011-04-21 14:48:26,044 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x12f76bfa98a0007, likely server has closed socket, closing socket connection and attempting reconnect
>> 2011-04-21 14:48:26,297 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server /162.2.6.187:2181
>> 2011-04-21 14:48:26,298 WARN org.apache.zookeeper.ClientCnxn: Session 0x12f76bfa98a0004 for server null, unexpected error, closing socket connection and attempting reconnect
>> java.net.ConnectException: Connection refused
>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1130)
>> 2011-04-21 14:48:26,472 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server 162-2-16-6/162.2.16.6:2181
>> 2011-04-21 14:48:26,472 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to 162-2-16-6/162.2.16.6:2181, initiating session
>> 2011-04-21 14:48:26,474 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server 162-2-16-6/162.2.16.6:2181, sessionid = 0x12f76bfa98a0004, negotiated timeout = 40000
>> 2011-04-21 14:48:27,052 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server 162-2-6-187/162.2.6.187:2181
>> 2011-04-21 14:48:27,053 WARN org.apache.zookeeper.ClientCnxn: Session 0x12f76bfa98a0007 for server null, unexpected error, closing socket connection and attempting reconnect
>> java.net.ConnectException: Connection refused
>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1130)
>> 2011-04-21 14:48:27,318 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server 162-2-16-6/162.2.16.6:2181
