Vidhya,

Could you post a snippet of an RS log during this time?  You should be able to 
see what's happening between when the OPEN message gets there and the OPEN 
completes.

Like Stack said, it's probably that its single-threaded in the version you're 
using and with all the file opening, your NN and DNs are under heavy load.  Do 
you see io-wait or anything else jump up across the cluster at this time?  You 
have ganglia setup on this bad boy?

JG

> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of
> Stack
> Sent: Friday, August 27, 2010 5:36 PM
> To: [email protected]
> Subject: Re: Initial region loads in hbase..
> 
> In 0.20, open on a regionserver is single-threaded.  Could that be it?
>  The server has lots of regions to open and its taking time?  Is the
> meta table being beat up?  Could this be holding up region opens?
> 
> Good luck V,
> St.Ack
> 
> 
> On Fri, Aug 27, 2010 at 5:01 PM, Vidhyashankar Venkataraman
> <[email protected]> wrote:
> > Hi guys,
> >  A couple of days back, I had posted a problem on regions taking too
> much time to load when I restart Hbase.. I have a table that has around
> 80 K regions on 650 nodes (!) ..
> >  I was checking the logs in the master and I notice that the time it
> takes from assigning a region to a region server to the point when it
> recognizes that the region is open in that server takes around 20-30
> minutes!
> >   Apart from master being the bottleneck here, can you guys let me
> know what the other possible cases are as to why this may happen?
> >
> > Thank you
> > Vidhya
> >
> > Below is an example for region with start key 0000003404803994 where
> the assignment takes place at 22:59 while the confirmation that it got
> open came at 23:19...
> >
> > 2010-08-27 22:59:02,642 DEBUG
> org.apache.hadoop.hbase.master.BaseScanner: Current assignment of
> DocDB,0000003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1.
> is not valid;  serverAddress=, startCode=0 unknown.
> > 2010-08-27 22:59:02,643 DEBUG
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Creating UNASSIGNED
> region 73c0f8fdb8ffbc20b9a239d325932ff1 in state = M2ZK_REGION_OFFLINE
> > 2010-08-27 22:59:02,645 DEBUG org.apache.hadoop.hbase.master.HMaster:
> Event NodeCreated with state SyncConnected with path
> /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
> > 2010-08-27 22:59:02,645 DEBUG
> org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Got event
> NodeCreated with path
> /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
> > 2010-08-27 22:59:02,645 DEBUG
> org.apache.hadoop.hbase.master.ZKUnassignedWatcher: ZK-EVENT-PROCESS:
> Got zkEvent NodeCreated state:SyncConnected
> path:/hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
> > 2010-08-27 22:59:02,645 DEBUG
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper:
> <b3130520.yst.yahoo.net,b3130560.yst.yahoo.net,b3130600.yst.yahoo.net,b
> 3130640.yst.yahoo.net,b3130680.yst.yahoo.net:/hbase,org.apache.hadoop.h
> base.master.HMaster>Created ZNode
> /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1 in ZooKeeper
> > 2010-08-27 22:59:02,646 DEBUG
> org.apache.hadoop.hbase.master.RegionManager: Created/updated
> UNASSIGNED zNode
> DocDB,0000003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1.
> in state M2ZK_REGION_OFFLINE
> > 2010-08-27 22:59:02,646 DEBUG
> org.apache.hadoop.hbase.master.ZKUnassignedWatcher: Got event type [
> M2ZK_REGION_OFFLINE ] for region 73c0f8fdb8ffbc20b9a239d325932ff1
> > 2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.HMaster:
> Event NodeChildrenChanged with state SyncConnected with path
> /hbase/UNASSIGNED
> > 2010-08-27 22:59:02,646 DEBUG
> org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Got event
> NodeChildrenChanged with path /hbase/UNASSIGNED
> > 2010-08-27 22:59:02,646 DEBUG
> org.apache.hadoop.hbase.master.ZKUnassignedWatcher: ZK-EVENT-PROCESS:
> Got zkEvent NodeChildrenChanged state:SyncConnected
> path:/hbase/UNASSIGNED
> > 2010-08-27 22:59:02,647 DEBUG
> org.apache.hadoop.hbase.master.RegionManager: Assigning for
> serverName=b3130020.yst.yahoo.net,60020,1282940735627,
> load=(requests=0, regions=76, usedHeap=80, maxHeap=7993): total
> nregions to assign=1, regions to give other servers than this=0,
> isMetaAssign=false
> > 2010-08-27 22:59:02,647 DEBUG
> org.apache.hadoop.hbase.master.RegionManager: Assigning
> serverName=b3130020.yst.yahoo.net,60020,1282940735627,
> load=(requests=0, regions=76, usedHeap=80, maxHeap=7993) 1 regions
> > 2010-08-27 22:59:02,647 INFO
> org.apache.hadoop.hbase.master.RegionManager: Assigning region
> DocDB,0000003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1.
> to b3130020.yst.yahoo.net,60020,1282940735627
> > 2010-08-27 22:59:02,653 DEBUG
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: No need to update
> UNASSIGNED region 73c0f8fdb8ffbc20b9a239d325932ff1 as it already exists
> in state = M2ZK_REGION_OFFLINE
> > 2010-08-27 22:59:02,653 DEBUG
> org.apache.hadoop.hbase.master.RegionManager: Created UNASSIGNED zNode
> DocDB,0000003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1.
> in state M2ZK_REGION_OFFLINE
> >
> >
> >
> >
> > AND THEN,
> >
> > 2010-08-27 23:18:53,591 INFO
> org.apache.hadoop.hbase.master.RegionServerOperation:
> DocDB,0000003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1.
> open on b3130020.yst.yahoo.net,60020,1282940735627
> > 2010-08-27 23:18:53,591 INFO
> org.apache.hadoop.hbase.master.RegionServerOperation: Updated row
> DocDB,0000003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1.
> in region .META.,,1 with startcode=1282940735627,
> server=b3130020.yst.yahoo.net:60020
> > 2010-08-27 23:18:53,677 DEBUG
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Deleting ZNode
> /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1 in ZooKeeper as
> region is open...
> >
> >
> >

Reply via email to