Hi Jan,

Thank you. I hope this did not come across as derogatory; I really meant it
in a friendly way (emails sometimes - errr often - do not convey this right).
Lars

On Tue, Dec 14, 2010 at 5:00 PM, Jan Lukavský <[email protected]> wrote:
> Hi Lars,
>
> sure, I understand this. :-)
>
> Thanks.
>
> On 14.12.2010 16:17, Lars George wrote:
>>
>> Hi Jan,
>>
>> Any day now!
>>
>> Really, there are just a few little road bumps, but nothing major, and
>> once they are resolved it will be released. Rushing it just for the sake
>> of releasing it will not make anyone happy (if we then find issues right
>> afterwards). Please bear with us!
>>
>> Lars
>>
>> On Tue, Dec 14, 2010 at 10:20 AM, Jan Lukavský
>> <[email protected]> wrote:
>>>
>>> Hi Daniel,
>>>
>>> I thought that version 0.90.0 would have major rewrites in this area.
>>> Could you give a rough estimate of when the new version will be out?
>>>
>>> Thanks,
>>> Jan
>>>
>>> On 13.12.2010 20:43, Jean-Daniel Cryans wrote:
>>>>
>>>> Hi Jan,
>>>>
>>>> That area of HBase was reworked a lot in the upcoming 0.90.0, and
>>>> region opening and closing can now be done in parallel for multiple
>>>> regions.
>>>>
>>>> Also, the balancer works differently and may not assign even a single
>>>> region to a new region server (or a dead one that was restarted) until
>>>> the balancer runs (it now runs every 5 minutes).
>>>>
>>>> Those behaviors are completely new, so they will probably need better
>>>> tuning, and there's still a lot to do regarding region balancing in
>>>> general, but it's probably worth trying it out.
>>>>
>>>> Regarding limiting the number of regions, you probably want to use LZO
>>>> (99% of the time it's faster for your tables) and set MAX_FILESIZE to
>>>> something like 1GB, since the default is pretty low.
>>>>
>>>> Maybe your new config would be useful in the new master too; I have to
>>>> give it more thought.
>>>>
>>>> J-D
>>>>
>>>> On Mon, Dec 13, 2010 at 8:36 AM, Jan Lukavský
>>>> <[email protected]> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> we are using HBase 0.20.6 on a cluster of about 25 nodes with about
>>>>> 30k regions, and we are experiencing an issue which causes running
>>>>> M/R jobs to fail. When we restart a single RegionServer, the
>>>>> following happens:
>>>>> 1) all regions of that RS get reassigned to the remaining (say 24)
>>>>> nodes
>>>>> 2) when the restarted RegionServer comes up, the HMaster closes about
>>>>> 60 regions on each of the 24 nodes and assigns them back to the
>>>>> restarted node
>>>>>
>>>>> Now, step 1) is usually very quick (if we can assign 10 regions per
>>>>> heartbeat, we get 240 regions per heartbeat on the whole cluster).
>>>>> Step 2) seems problematic, because first about 1200 regions get
>>>>> unassigned, and then they get slowly assigned to the single RS (again
>>>>> at 10 regions per heartbeat). During this time, the clients in the
>>>>> map tasks connected to those regions throw RetriesExhaustedException.
>>>>>
>>>>> I'm aware that we can limit the number of regions closed per
>>>>> RegionServer heartbeat with hbase.regions.close.max, but this config
>>>>> option seems a bit unsatisfactory, because as we increase the size of
>>>>> the cluster, we will get more and more regions unassigned in a single
>>>>> cluster heartbeat (say we limit this to 1, then we get 24 unassigned
>>>>> regions, but only 10 assigned per heartbeat). This led us to a
>>>>> solution which seems quite simple. We have introduced a new config
>>>>> option which is used to limit the number of regions in transition.
>>>>> When regionsInTransition.size() crosses the boundary, we temporarily
>>>>> stop the load balancer.
>>>>> This seems to resolve our issue, because no region stays unassigned
>>>>> for a long time, and clients manage to recover within their number
>>>>> of retries.
>>>>>
>>>>> My question is: is this a general issue for which a new config option
>>>>> should be proposed, or am I missing something, and we could have
>>>>> resolved the issue with some other config option tuning?
>>>>>
>>>>> Thanks,
>>>>> Jan
>>>>>
>>>>>
>>>
>>> --
>>>
>>> Jan Lukavský
>>> programmer
>>> Seznam.cz, a.s.
>>> Radlická 608/2
>>> 15000, Praha 5
>>>
>>> [email protected]
>>> http://www.seznam.cz
>>>
>>>
>
>
> --
>
> Jan Lukavský
> programmer
> Seznam.cz, a.s.
> Radlická 608/2
> 15000, Praha 5
>
> [email protected]
> http://www.seznam.cz
>
>
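For reference, J-D's suggestion above (LZO compression plus a MAX_FILESIZE of
about 1GB) could be applied with the 0.90-era Java client along these lines.
This is only a sketch: the table name "mytable" is a placeholder, and LZO must
already be installed on every node before enabling it:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.io.hfile.Compression;
    import org.apache.hadoop.hbase.util.Bytes;

    public class RegionSizeTuning {
      public static void main(String[] args) throws Exception {
        HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
        byte[] table = Bytes.toBytes("mytable"); // placeholder table name

        HTableDescriptor desc = admin.getTableDescriptor(table);
        // Split regions at ~1GB instead of the small default.
        desc.setMaxFileSize(1024L * 1024L * 1024L);
        // Compress every column family's store files with LZO.
        for (HColumnDescriptor family : desc.getColumnFamilies()) {
          family.setCompressionType(Compression.Algorithm.LZO);
        }

        admin.disableTable(table); // table must be offline to modify it
        admin.modifyTable(table, desc);
        admin.enableTable(table);
      }
    }

Fewer, larger regions also shrink the reassignment storm Jan describes, since
there is simply less to move when a RegionServer restarts.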
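The options Jan mentions live in hbase-site.xml. A sketch with illustrative
values only: hbase.regions.close.max is the 0.20.x option he names, and
hbase.client.retries.number is the standard client retry count that determines
whether the map tasks survive the reassignment window:

    <property>
      <name>hbase.regions.close.max</name>
      <!-- regions closed per RegionServer per master heartbeat -->
      <value>1</value>
    </property>
    <property>
      <name>hbase.client.retries.number</name>
      <!-- raising this helps clients outlive long reassignments -->
      <value>20</value>
    </property>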
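Jan's balancer throttle is not part of any released HBase, but the idea boils
down to a check like the sketch below. The class name, the threshold, and the
plain map standing in for the master's regionsInTransition structure are all
illustrative, not his actual patch:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class BalancerThrottleSketch {
      // Stand-in for the master's map of regions currently opening or closing.
      private final Map<String, String> regionsInTransition =
          new ConcurrentHashMap<String, String>();

      // Hypothetical limit; the option name in Jan's patch is not given.
      private final int maxRegionsInTransition = 250;

      // Consulted once per balance cycle before the balancer may run.
      boolean balancerMayRun() {
        // While too many regions are in flight, skip rebalancing so that
        // already-unassigned regions are reopened before new ones are closed.
        return regionsInTransition.size() < maxRegionsInTransition;
      }
    }

The appeal of gating on regionsInTransition.size() rather than on a per-node
close limit is exactly what Jan points out: it stays bounded as the cluster
grows, instead of scaling with the number of RegionServers.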
