It's not in the manual yet, Vidhya. Assignment has completely changed in 0.90. We no longer assign by adding payload to the heartbeat. Now we assign by direct RPC from master to regionserver, with master and regionserver moving the region through state changes up in zk until the region is successfully opened (OFFLINE -> OPENING -> OPENED -- with OPENING re-asserted a few times along the way to make sure there has been no intercession just because operations were taking too long).
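In case it helps to see the shape of that handshake, here is a minimal sketch written against the plain ZooKeeper client API. The znode path, the string payloads, and the method names are all invented for illustration -- this is not the actual AssignmentManager/ZKAssign code, just the idea of versioned state transitions on a per-region znode:

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;

public class RegionOpenSketch {
  // Hypothetical parent of the per-region "unassigned" znodes.
  static final String UNASSIGNED = "/hbase/unassigned";

  // Master side: mark the region OFFLINE up in zk, then direct-RPC the open
  // to the chosen regionserver.
  static void masterAssign(ZooKeeper zk, String regionName) throws Exception {
    zk.create(UNASSIGNED + "/" + regionName, "OFFLINE".getBytes(),
        Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    // ... rpc: regionserver.openRegion(regionName) (method name invented) ...
  }

  // Regionserver side: walk the znode OFFLINE -> OPENING -> OPENED,
  // re-setting OPENING now and then so the master can see the open is still
  // in progress; carrying the znode version forward means an intervening
  // change (someone else grabbed the region) fails the next setData.
  static void regionserverOpen(ZooKeeper zk, String regionName) throws Exception {
    String path = UNASSIGNED + "/" + regionName;
    int version = zk.setData(path, "OPENING".getBytes(), -1).getVersion();
    // ... long-running open work, with periodic re-asserts of OPENING ...
    version = zk.setData(path, "OPENING".getBytes(), version).getVersion();
    // A BadVersionException here means someone interceded in the meantime.
    zk.setData(path, "OPENED".getBytes(), version);
  }
}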
Let me explain more. In the new master, on fresh startup (as opposed to a master joining an already-running cluster), after waiting on regionserver check-in and having assigned root and meta, we take a distinct 'fresh startup' code path. We scan .META. for all entries and hand them all to the load balancer, which produces an assignment plan keyed per regionserver. We then fire up a little executor service that runs a bounded number of threads concurrently; each running thread manages the bulk assign of regions to a particular regionserver.

Assignment in general can now be a little slower because all state transitions are mediated via zookeeper rather than in-memory in the master. But in this special bulk-assign startup mode we make use of zk's async ops, doing bulk state transition changes up in zk rather than managing individual changes, so it all runs faster. There is also a new rpc where we can dump on the RS all the regions it is to open, and ZK timeouts during this startup phase are all extended. The bulk-assign thread for a regionserver stays up until all regions have opened on that regionserver; then the executor finishes it and runs the next (we could do better here -- especially on a cluster of 700 nodes). I spent time timing this stuff and I'd say bulk assign, even with the async zk ops, is probably slower than how we used to do it, but not by much. (There is a rough sketch of this bulk-assign flow at the bottom of this mail.)

The new master logs are very different from the old, so it might take a while to get your head around what's going on. Hopefully you can avoid having to do this.

What are you seeing?
St.Ack

On Mon, Jan 31, 2011 at 10:19 AM, Vidhyashankar Venkataraman
<[email protected]> wrote:
> Yes, I will file an issue after collecting the right logs.
>
> We will try finding the cause of the META server choke.
>
> Another question: the master still seems to be taking (a lot of) time to load
> the table during startup: I found that the regions percheckin config variable
> isn't used anymore. I haven't looked at that part of the code yet, but what is
> now the master's part in assigning regions in 0.90? (Can you let me know if
> they are explained in the HBase docs in the release?)
>
> Thank you
> Vidhya
>
> On 1/31/11 10:06 AM, "Stack" <[email protected]> wrote:
>
> On Mon, Jan 31, 2011 at 9:54 AM, Vidhyashankar Venkataraman
> <[email protected]> wrote:
>> The HBase cluster doesn't have the master problems with hadoop-append turned
>> on: we will try finding out why it wasn't working with a non-append version
>> of hadoop (with a previous version of hadoop, it was getting stuck while
>> splitting logs).
>>
>
> I'd say don't bother, Vidhya. You should run w/ append anyways.
> Meantime, file an issue if you don't mind and dump your data in there.
> We need to take care of this so others don't trip over what you saw.
> I'm sure that plenty of users will innocently try to bring up 0.90.0
> on a Hadoop w/o append.
>
>> But there are other issues now (with append turned on) which we are trying
>> to resolve. The region server that's hosting the META region is getting
>> choked after a table was loaded with around 100 regions per server (this is
>> likely the target load that we wanted to have; it worked in 0.89 with the
>> same number of nodes, and HBase 0.90 worked fine with 40 nodes, which is
>> why I started straight with this number). The node can be pinged, but is not
>> accessible through ssh, and I am unable to perform most HBase operations on
>> the cluster as a result.
>>
>> Can the RS hosting META be a potential bottleneck in the system at all?
>> (I will try shutting down that particular node and see what happens.)
>>
>
> At 700 nodes scale, it's quite possible we're doing something dumb.
> Any data you can glean to help us here would be appreciated. I'd have
> thought that 0.90.0 would put less load on .META. since we've removed
> some of the reasons for .META. access.
>
> St.Ack
>
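P.S. The bulk-assign startup flow mentioned above, roughly sketched. Again, the class names, the znode path, and the openRegions rpc are invented for illustration -- the real master code differs -- but the shape is what I described: a bounded pool of per-regionserver tasks, async zk creates so we are not paying a round trip per region, then a single rpc handing the regionserver its whole list of regions to open.

import java.util.List;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import org.apache.zookeeper.AsyncCallback.StringCallback;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;

public class BulkAssignSketch {
  // plan: regionserver -> the regions the load balancer gave it.
  // Runs at most maxConcurrentServers per-regionserver bulk assigns at a time.
  static void bulkAssign(final ZooKeeper zk, Map<String, List<String>> plan,
      int maxConcurrentServers) throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(maxConcurrentServers);
    for (Map.Entry<String, List<String>> e : plan.entrySet()) {
      final String server = e.getKey();
      final List<String> regions = e.getValue();
      pool.execute(new Runnable() {
        public void run() {
          try {
            // 1. Create all the OFFLINE znodes with async creates so the
            //    bulk state changes go up in zk without a round trip each.
            final CountDownLatch latch = new CountDownLatch(regions.size());
            StringCallback cb = new StringCallback() {
              public void processResult(int rc, String path, Object ctx, String name) {
                latch.countDown();
              }
            };
            for (String region : regions) {
              zk.create("/hbase/unassigned/" + region, "OFFLINE".getBytes(),
                  Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT, cb, null);
            }
            latch.await();
            // 2. One rpc dumping the whole list on the regionserver
            //    (hypothetical call): rsConnection(server).openRegions(regions);
            // 3. Stay up until every region has gone to OPENED on this
            //    regionserver, then let the next server's task run.
          } catch (Exception ex) {
            // the real thing retries/reassigns here
          }
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(Long.MAX_VALUE, TimeUnit.SECONDS);
  }
}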
