I mean TRUNK. 0.89s have been cut from TRUNK every 3 or 4 weeks or so.
J-D is about to put up our next 0.89. It does not have the new load
balancer. The next release, we hope, will be 0.90.0RC1. That'll have the
new balancer. Feature freeze is this Wednesday. Hopefully it'll be up not
too long after that (a week or two?).

As to running out of TRUNK, you could, but it'd be super risky. I'd say
if you are up for risk, wait a little while. We're busy doing
stabilization at the mo.

St.Ack

On Mon, Oct 4, 2010 at 11:00 PM, Jack Levin <[email protected]> wrote:
> By trunk, you mean 0.89 or 0.20.6?
>
> -Jack
>
> On Mon, Oct 4, 2010 at 10:59 PM, Jack Levin <[email protected]> wrote:
>> Full stop of all region servers, restart of master, is what brings it all
>> back:
>>
>> Please see attached. Lots of data there, search for 'Shedding'.
>>
>> -Jack
>>
>> On Mon, Oct 4, 2010 at 9:42 PM, Stack <[email protected]> wrote:
>>> So, required a start/stop to fix balance issue?
>>>
>>> Can I see master log from around problematic time?
>>>
>>> (The load balancer has been completely redone in TRUNK)
>>>
>>> St.Ack
>>>
>>> On Mon, Oct 4, 2010 at 6:23 PM, Jack Levin <[email protected]> wrote:
>>>> http://pastebin.com/suw2QVYg this is OOME event.
>>>>
>>>> When I started it up, the master eventually stopped shedding to 14
>>>> regions each (used to be 700 on 10 servers), and stayed there for a
>>>> while, I waited 10 minutes, and stopped/started all region servers,
>>>> and they came up in 5 minutes.
>>>>
>>>> -Jack
>>>>
>>>> On Mon, Oct 4, 2010 at 5:48 PM, Jack Levin <[email protected]> wrote:
>>>>> 2010-10-04 17:47:25,449 DEBUG
>>>>> org.apache.hadoop.hbase.master.RegionManager: Server(s) are carrying
>>>>> only 2 regions. Server mtab5.prod.imageshack.com,60020,1285878100774
>>>>> is most loaded (290). Shedding 32 regions to pass to least loaded
>>>>> (numMoveToLowLoaded=177)
>>>>>
>>>>>
>>>>> I observe that number of loaded regions sheds pretty much to zero
>>>>> before starting back up (taking long time in the process), even though
>>>>> I had server that OOME'ed started up again. It seems to me there
>>>>> might be a bug in rebalancing logic?
>>>>>
>>>>> -Jack
>>>>>
>>>>
>>>
>>
>
