Hi,

No one has run HBase at that scale that I know of.  Perhaps you can
poke at the problem some more and come up with a working theory and
possibly a code fix?

Good luck!
-ryan

On Fri, Aug 27, 2010 at 5:01 PM, Vidhyashankar Venkataraman
<[email protected]> wrote:
> Hi guys,
>  A couple of days back, I had posted a problem on regions taking too much 
> time to load when I restart Hbase.. I have a table that has around 80 K 
> regions on 650 nodes (!) ..
>  I was checking the logs in the master and I notice that the time it takes 
> from assigning a region to a region server to the point when it recognizes 
> that the region is open in that server takes around 20-30 minutes!
>   Apart from master being the bottleneck here, can you guys let me know what 
> the other possible cases are as to why this may happen?
>
> Thank you
> Vidhya
>
> Below is an example for region with start key 0000003404803994 where the 
> assignment takes place at 22:59 while the confirmation that it got open came 
> at 23:19...
>
> 2010-08-27 22:59:02,642 DEBUG org.apache.hadoop.hbase.master.BaseScanner: 
> Current assignment of 
> DocDB,0000003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. is not 
> valid;  serverAddress=, startCode=0 unknown.
> 2010-08-27 22:59:02,643 DEBUG 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Creating UNASSIGNED 
> region 73c0f8fdb8ffbc20b9a239d325932ff1 in state = M2ZK_REGION_OFFLINE
> 2010-08-27 22:59:02,645 DEBUG org.apache.hadoop.hbase.master.HMaster: Event 
> NodeCreated with state SyncConnected with path 
> /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
> 2010-08-27 22:59:02,645 DEBUG 
> org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Got event NodeCreated 
> with path /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
> 2010-08-27 22:59:02,645 DEBUG 
> org.apache.hadoop.hbase.master.ZKUnassignedWatcher: ZK-EVENT-PROCESS: Got 
> zkEvent NodeCreated state:SyncConnected 
> path:/hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
> 2010-08-27 22:59:02,645 DEBUG 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: 
> <b3130520.yst.yahoo.net,b3130560.yst.yahoo.net,b3130600.yst.yahoo.net,b3130640.yst.yahoo.net,b3130680.yst.yahoo.net:/hbase,org.apache.hadoop.hbase.master.HMaster>Created
>  ZNode /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1 in ZooKeeper
> 2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.RegionManager: 
> Created/updated UNASSIGNED zNode 
> DocDB,0000003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. in 
> state M2ZK_REGION_OFFLINE
> 2010-08-27 22:59:02,646 DEBUG 
> org.apache.hadoop.hbase.master.ZKUnassignedWatcher: Got event type [ 
> M2ZK_REGION_OFFLINE ] for region 73c0f8fdb8ffbc20b9a239d325932ff1
> 2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.HMaster: Event 
> NodeChildrenChanged with state SyncConnected with path /hbase/UNASSIGNED
> 2010-08-27 22:59:02,646 DEBUG 
> org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Got event 
> NodeChildrenChanged with path /hbase/UNASSIGNED
> 2010-08-27 22:59:02,646 DEBUG 
> org.apache.hadoop.hbase.master.ZKUnassignedWatcher: ZK-EVENT-PROCESS: Got 
> zkEvent NodeChildrenChanged state:SyncConnected path:/hbase/UNASSIGNED
> 2010-08-27 22:59:02,647 DEBUG org.apache.hadoop.hbase.master.RegionManager: 
> Assigning for serverName=b3130020.yst.yahoo.net,60020,1282940735627, 
> load=(requests=0, regions=76, usedHeap=80, maxHeap=7993): total nregions to 
> assign=1, regions to give other servers than this=0, isMetaAssign=false
> 2010-08-27 22:59:02,647 DEBUG org.apache.hadoop.hbase.master.RegionManager: 
> Assigning serverName=b3130020.yst.yahoo.net,60020,1282940735627, 
> load=(requests=0, regions=76, usedHeap=80, maxHeap=7993) 1 regions
> 2010-08-27 22:59:02,647 INFO org.apache.hadoop.hbase.master.RegionManager: 
> Assigning region 
> DocDB,0000003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. to 
> b3130020.yst.yahoo.net,60020,1282940735627
> 2010-08-27 22:59:02,653 DEBUG 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: No need to update 
> UNASSIGNED region 73c0f8fdb8ffbc20b9a239d325932ff1 as it already exists in 
> state = M2ZK_REGION_OFFLINE
> 2010-08-27 22:59:02,653 DEBUG org.apache.hadoop.hbase.master.RegionManager: 
> Created UNASSIGNED zNode 
> DocDB,0000003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. in 
> state M2ZK_REGION_OFFLINE
>
>
>
>
> AND THEN,
>
> 2010-08-27 23:18:53,591 INFO 
> org.apache.hadoop.hbase.master.RegionServerOperation: 
> DocDB,0000003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. open 
> on b3130020.yst.yahoo.net,60020,1282940735627
> 2010-08-27 23:18:53,591 INFO 
> org.apache.hadoop.hbase.master.RegionServerOperation: Updated row 
> DocDB,0000003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. in 
> region .META.,,1 with startcode=1282940735627, 
> server=b3130020.yst.yahoo.net:60020
> 2010-08-27 23:18:53,677 DEBUG 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Deleting ZNode 
> /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1 in ZooKeeper as region is 
> open...
>
>
>

Reply via email to