Jonathan: Can you publish performance metrics (compared with current trunk) from a cluster running the new master?
Thanks

On Tue, Aug 31, 2010 at 10:20 AM, Jonathan Gray <jg...@facebook.com> wrote:

> Though I'm sure my vote is clear, I'm +1 on this.
>
> The plan at fb is to update our internal branch to (almost) the current
> head of trunk, before the commit of the master branch. Ongoing testing will
> continue on this branch.
>
> In parallel, testing will also begin here on the new master following the
> mega commit.
>
> Hopefully we can transition everything to the new master sooner rather than
> later instead of splitting time. I'd say shortly after initial testing is
> complete we should push for a new master 0.89 or 0.90RC and ask users to
> test as much as possible.
>
> I did as much as possible to try and get reviews along the way, including
> several very early design discussions and group code review sessions, but
> this is a pretty radical change so it has not been easy. If you're familiar
> with the old BaseScanner, RegionManager, ZooKeeperWrapper, etc., this stuff
> has been cut and the replacements are much shorter/simpler.
>
> Just need to find all the bugs and fill in the oversights :)
>
> Stack, thanks for carrying this thing over the finish line.
>
> JG
>
> > -----Original Message-----
> > From: saint....@gmail.com [mailto:saint....@gmail.com] On Behalf Of
> > Stack
> > Sent: Tuesday, August 31, 2010 12:44 AM
> > To: HBase Dev List
> > Subject: Heads-up: big commit in next day or so; "HBASE-2692 Master
> > rewrite and cleanup for 0.90"
> >
> > I just posted the patch to https://review.cloudera.org/r/750/. It's a
> > little on the large side (1.5MB. Sorry about that).
> >
> > The bulk of the patch is by Karthik Ranganathan and Jon Gray. They've
> > been working on it in the 0.90_master_rewrite branch for a good few
> > months now. It's been reviewed pretty extensively, multiple times, but
> > it's too big for any one individual to review in anything but a cursory
> > manner in its current form (Again, sorry about that).
> > Piece-mealing the changes into the code base was tried but getting all
> > of the stepped changes in was going to take eons to complete and when we
> > tried it, it wasn't working well anyways -- reviewers had a hard time
> > getting their heads around partial feature implementations and
> > groundwork baffled when the superstructure wasn't coming till a later
> > stage.
> >
> > This patch addresses issues head on that have plagued us for what
> > seems like ages now -- troublesome assignment of regions in
> > particular -- and IMO in spite of its size and lack of review, unless
> > there is objection, I'm going to go ahead and commit this patch tomorrow
> > or the day after, after all tests pass. We could let this monster stew
> > out on the branch for another couple of weeks or a month but IMO, it's
> > mature enough to be added to TRUNK so we can all work on the
> > stabilization that will get us to 0.90.0 Release Candidate.
> >
> > See the umbrella issue for all that's addressed -- about 11 or 12
> > issues in all, a few of them blockers -- but here is a synopsis of
> > what the patch includes:
> >
> > + Region in transition data structure is now kept out in zookeeper to
> > facilitate master failover and to do away with race conditions that
> > used to result in double assignment of regions
> > + Open and close of regions as well as server shutdown handling and
> > table transitions are now done in Executors; config. says how much
> > parallelism to run with. Default is 3 openers, 3 closers, with
> > designated handlers for meta and root opening, etc. (We used to be
> > single-threaded in master and regionserver doing opens/closes, etc.)
> > + New load balancer; features include figuring out the plan on startup
> > and then assigning out all regions in the one assignment. New method
> > in admin tool allows you to unload a region from one server and assign
> > it to another explicit server.
> > + Most of what passed over the heartbeating mechanism now goes via zk,
> > or the master directly invokes rpc to close/open regions rather than
> > waiting on the heartbeat to come around
> >
> > There is more, including a bunch of cleanup and refactorings that in
> > particular facilitate testing, and this patch lays the groundwork for
> > new features coming down the pipeline (the same executor/handler
> > mechanism will get us parallel flushing, splitting and compacting).
> >
> > Things will look different after this patch goes in: there are lots of
> > zk transitions in logs now, and this patch is going to drum up new
> > kinds of bugs, but after a week of gung-ho bug bashing we should have
> > ourselves a more robust hbase.
> >
> > St.Ack
>