Re: Improving on MTTR of cluster [Hbase - 1.1.13]

2018-09-27 Thread sahil aggarwal
Just FYI, increasing hbase.regionserver.executor.openregion.threads helped
significantly (from 20-25 mins to <2 mins for ~2200 regions on 4 RS).
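
A minimal sketch of the corresponding hbase-site.xml entry on each region
server follows. The value 20 is an illustrative assumption, not a figure from
this thread; the shipped default is small (believed to be 3 in
hbase-default.xml), so check the default for your version and tune against
your own restart timings.

```xml
<!-- hbase-site.xml on each region server -->
<property>
  <name>hbase.regionserver.executor.openregion.threads</name>
  <!-- Hypothetical value for illustration only; not taken from this thread. -->
  <value>20</value>
</property>
```

Region servers most likely need a restart to pick up the new pool size, since
the open-region executor is created at startup.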

I have created a patch to document this:
https://jira.apache.org/jira/projects/HBASE/issues/HBASE-21186?filter=myopenissues



Re: Improving on MTTR of cluster [Hbase - 1.1.13]

2018-09-11 Thread sahil aggarwal
Thanks Ted.

Regarding hbase.assignment.usezk as well: it seems to require hbase:meta and
the HMaster to be co-hosted, but
http://hbase.apache.org/2.0/book.html#upgrade2.0.regions.on.master
says that the "Master hosting regions" feature is broken and unsupported.

Is there anything else I can tap into to speed up region assignment?




Re: Improving on MTTR of cluster [Hbase - 1.1.13]

2018-09-10 Thread Ted Yu
For the second config you mentioned, hbase.master.distributed.log.replay,
see http://hbase.apache.org/book.html#upgrade2.0.distributed.log.replay

FYI

On Mon, Sep 10, 2018 at 8:52 AM sahil aggarwal wrote:

> Hi,
>
> My cluster has around 50k regions and 130 RS. After an unclean shutdown,
> the cluster takes around 40-50 mins to come up (mostly slow on region
> assignment, from observation). While trying to optimize this, I found the
> following possible configs:
>
> *hbase.assignment.usezk:* which will co-host the meta table and HMaster and
> avoid ZK interaction for region assignment.
> *hbase.master.distributed.log.replay:* to replay the edit logs in a
> distributed manner.
>
>
> Testing *hbase.assignment.usezk* alone on a small cluster (2200 regions,
> 4 RS) gave the following results:
>
> hbase.assignment.usezk=true -> 12 mins
> hbase.assignment.usezk=false -> 9 mins
>
>
> From this blog, I was expecting better results, so I am probably missing
> something. I would appreciate any pointers.
>
> Thanks,
> Sahil
>