I just posted the patch to https://review.cloudera.org/r/750/. It's a little on the large side (1.5MB; sorry about that).
The bulk of the patch is by Karthik Ranganathan and Jon Gray. They've been working on it in the 0.90_master_rewrite branch for a good few months now. It's been reviewed pretty extensively, multiple times, but it's too big for any one individual to review in anything but a cursory manner in its current form (again, sorry about that). Piecemealing the changes into the code base was tried, but getting all of the stepped changes in was going to take eons, and when we tried it, it wasn't working well anyway -- reviewers had a hard time getting their heads around partial feature implementations, and the groundwork baffled them when the superstructure wasn't coming till a later stage.

This patch addresses head on issues that have plagued us for what seems like ages now -- troublesome assignment of regions in particular. In spite of its size and lack of review, unless there is objection, I'm going to go ahead and commit this patch tomorrow or the day after, once all tests pass. We could let this monster stew out on the branch for another couple of weeks or a month, but IMO it's mature enough to be added to TRUNK so we can all work on the stabilization that will get us to a 0.90.0 Release Candidate.

See the umbrella issue for all that's addressed -- about 11 or 12 issues in all, a few of them blockers -- but here is a synopsis of what the patch includes:

+ The region-in-transition data structure is now kept out in zookeeper to facilitate master failover and to do away with race conditions that used to result in double assignment of regions
+ Open and close of regions, as well as server shutdown handling and table transitions, are now done in Executors; config. says how much parallelism to run with. Default is 3 openers and 3 closers, with designated handlers for meta and root opening, etc. (We used to be single-threaded in master and regionserver doing opens/closes, etc.)
+ New load balancer; features include figuring out the plan on startup and then assigning out all regions in the one assignment. New method in admin tool allows you to unload a region from one server and assign it to another explicit server.
+ Most of what passed over the heartbeating mechanism now goes via zk, or the master directly invokes rpc to close/open regions rather than wait on the heartbeat to come around

There is more, including a bunch of cleanup and refactorings that in particular facilitate testing, and this patch lays the groundwork for new features coming down the pipeline (the same executor/handler mechanism will get us parallel flushing, splitting and compacting).

Things will look different after this patch goes in: there are lots of zk transitions in logs now, and this patch is going to drum up new kinds of bugs, but after a week of gung-ho bug bashing we should have ourselves a more robust hbase.

St.Ack