[jira] [Commented] (HBASE-6625) If we have hundreds of thousands of regions getChildren will encouter zk exception

Jonathan Hsieh (JIRA) Wed, 22 Aug 2012 16:22:43 -0700

    [ https://issues.apache.org/jira/browse/HBASE-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439936#comment-13439936 ]


Jonathan Hsieh commented on HBASE-6625:
---------------------------------------

What if we lowered the region split limit to something 10x what is considered 
reasonable?  Max int is quite large, and 100k regions is also quite large.  If 
you have that many regions, you are "doing it wrong" or purposely trying to 
break HBase. 

If we have 100k 10GB regions, this means we have 1 petabyte (10^15 bytes) of 
region data *per* region server.  I believe that is the scale of an entire 
large HDFS cluster, not of a single region server. 
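
The arithmetic behind that estimate can be checked with a short sketch. It 
assumes decimal units (1 GB = 10^9 bytes); the constant and function names are 
made up for illustration and do not come from HBase. 

```python
# Back-of-the-envelope check: 100k regions of 10GB each on one region server.
REGIONS_PER_SERVER = 100_000          # "100k regions" from the comment
REGION_SIZE_BYTES = 10 * 10**9        # "10GB", assuming decimal gigabytes


def total_region_bytes(regions: int, region_size_bytes: int) -> int:
    """Total bytes of region data held by one region server."""
    return regions * region_size_bytes


total = total_region_bytes(REGIONS_PER_SERVER, REGION_SIZE_BYTES)

# Express the same quantity in petabytes (10^15) and exabytes (10^18).
print(f"{total} bytes = {total / 10**15:g} PB = {total / 10**18:g} EB")
```

With these inputs the total is 10^15 bytes, which is one petabyte, or one 
thousandth of an exabyte. 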

I don't see the point of allowing that to happen (even accidentally).  Setting 
it to something an order of magnitude larger than reasonable would hold us over 
for a year or so. :)  

                
> If we have hundreds of thousands of  regions getChildren will encouter zk 
> exception
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-6625
>                 URL: https://issues.apache.org/jira/browse/HBASE-6625
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Zhou wenjian
>            Assignee: Zhou wenjian
>
> 2012-05-13 19:37:37,528 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager$ExistsUnassignedAsyncCallback:
>  rs=CreateNewTableWith100000Regions,\x05\xB3\x06 
> g\xE8r\xBB]\x09\xCF,1336724029944.079cb2f8a375e66fa089291b82f2a03f. 
> state=OFFLINE, ts=1336909053108 
> 2012-05-13 19:37:37,528 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager$CreateUnassignedAsyncCallback:
>  rs=CreateNewTableWith100000Regions,\x08s\x84\x8 
> 8$7\xB1\xC4\xFCg,1336724030660.76c07780231942231013c7feb5e5eb14. 
> state=OFFLINE, ts=1336909055089, server=dw76.kgb.sqa.cm4,60020,1336908983944 
> 2012-05-13 19:37:37,528 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager$CreateUnassignedAsyncCallback:
>  rs=CreateNewTableWith100000Regions,\x08s\x89\xC 
> B\x9B\xF0\xE4\xCA\x97\xB0,1336724030660.fa38b9d8367387a64a327087cb43b3e0. 
> state=OFFLINE, ts=1336909055089, server=dw76.kgb.sqa.cm4,60020,1336908983944 
> 2012-05-13 19:37:37,528 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: 
> dw76.kgb.sqa.cm4,60020,1336908983944 unassigned znodes=58464 of total=120002 
> 2012-05-13 19:37:37,758 WARN org.apache.zookeeper.ClientCnxn: Session 
> 0x13745fc2c8d0001 for server dw51.kgb.sqa.cm4/10.232.98.51:2180, unexpected 
> error, closing socket connection and attempting reconnect 
> java.io.IOException: Packet len4320092 is out of range! 
>         at 
> org.apache.zookeeper.ClientCnxn$SendThread.readLength(ClientCnxn.java:710) 
>         at 
> org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:869) 
>         at 
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1130) 
> 2012-05-13 19:37:37,860 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil: 
> master:60000-0x13745fc2c8d0001 Unable to list children of znode 
> /hbase-new4/unassigned 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase-new4/unassigned 
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:90) 
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42) 
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1243) 
>         at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.listChildrenAndWatchForNewChildren(ZKUtil.java:302)
>  
>         at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndGetNewChildren(ZKUtil.java:413)
>  
>         at 
> org.apache.hadoop.hbase.master.AssignmentManager.nodeChildrenChanged(AssignmentManager.java:759)
>  
>         at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:314)
>  
>         at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530) 
>         at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506) 
> 2012-05-13 19:37:37,861 ERROR 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: 
> master:60000-0x13745fc2c8d0001 Received unexpected KeeperException, 
> re-throwing exception 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase-new4/unassigned 
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:90) 
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42) 
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1243) 
>         at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.listChildrenAndWatchForNewChildren(ZKUtil.java:302)
>  
>         at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndGetNewChildren(ZKUtil.java:413)
>  
>         at 
> org.apache.hadoop.hbase.master.AssignmentManager.nodeChildrenChanged(AssignmentManager.java:759)
>  
>         at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:314)
>  
>         at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530) 
>         at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506) 
> 2012-05-13 19:37:37,861 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unexpected ZK exception reading unassigned children 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase-new4/unassigned 
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:90) 
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42) 
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1243) 
>         at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.listChildrenAndWatchForNewChildren(ZKUtil.java:302)
>  
>         at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndGetNewChildren(ZKUtil.java:413)
>  
>         at 
> org.apache.hadoop.hbase.master.AssignmentManager.nodeChildrenChanged(AssignmentManager.java:759)
>  
>         at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:314)
>  
>         at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530) 
>         at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506) 
> 2012-05-13 19:37:37,861 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
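
The "Packet len4320092 is out of range!" line in the log above is consistent 
with the ZooKeeper client rejecting a getChildren response that is larger than 
its receive buffer limit. The sketch below is only a rough illustration under 
two assumptions that are not stated in the log: that the client limit is 
4096 * 1024 bytes (believed to be the client-side default for `jute.maxbuffer` 
in ZooKeeper releases of this period), and that the response listed about 
120002 children (the `total` figure in the log). All names are hypothetical. 

```python
# Rough sizing of the getChildren response that the client rejected.
ASSUMED_CLIENT_LIMIT = 4096 * 1024   # assumption: client default for jute.maxbuffer
PACKET_LEN = 4320092                 # from "Packet len4320092 is out of range!"
CHILD_COUNT = 120002                 # assumption: "total=120002" from the log


def exceeds_limit(packet_len: int, limit: int = ASSUMED_CLIENT_LIMIT) -> bool:
    """Mirror the kind of range check that produces the 'out of range' error."""
    return packet_len < 0 or packet_len >= limit


# How far over the assumed limit the response was.
overshoot = PACKET_LEN - ASSUMED_CLIENT_LIMIT

# Average serialized size of one child name in the response.
bytes_per_child = PACKET_LEN / CHILD_COUNT

# Approximate number of children that would still fit under the assumed limit.
max_children = ASSUMED_CLIENT_LIMIT // round(bytes_per_child)

print(exceeds_limit(PACKET_LEN), overshoot, round(bytes_per_child), max_children)
```

Under these assumptions each child costs about 36 bytes, so a single znode 
could hold a little over 116,000 children before the listing stops fitting. 
Raising the buffer limit only moves that ceiling. 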

