Re: avoiding hot spot for timestamp prefix key

Vladimir Rodionov Fri, 22 May 2015 14:04:10 -0700

RegionSplitPolicy only lets you customize the split point (a row key). All
rows that sort before this split point go to the first daughter region;
the rest go to the second.
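
To illustrate the point (a plain-Python sketch, not HBase code; split_region
is a hypothetical helper and the third key is made up for the example), a
single split point can only cut the sorted key space into two contiguous
ranges:

```python
from bisect import bisect_left

def split_region(sorted_keys, split_point):
    """Model a split: keys sorting before split_point go to the first
    daughter, the split point and everything after it to the second."""
    i = bisect_left(sorted_keys, split_point)
    return sorted_keys[:i], sorted_keys[i:]

# Keys shaped like the ones in this thread: timestamp#guid.
keys = sorted([
    b"2015-05-22 00:02:01#AB12EC77778888945",
    b"2015-05-22 00:02:02#CD9870001234AB457",
    b"2015-05-22 00:02:03#0F00000000000000A",  # made-up guid
])

first, second = split_region(keys, b"2015-05-22 00:02:02")
```

Because the timestamp leads the key, the newest rows always sort last, so
they land in the last region no matter which split point the policy picks.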


The answer to the original question is: no, you cannot have a custom
policy based on the second part of the key.
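
For comparison, the bucketing that Michael mentions in the quoted message
below is usually done in the row key itself rather than in the split policy.
A minimal sketch, assuming a bucket prefix derived from a CRC32 of the guid;
NUM_BUCKETS, bucketed_key and scan_prefixes are hypothetical names and this
is not the HBASE-12853 design:

```python
import zlib

NUM_BUCKETS = 16  # hypothetical; would normally be fixed when the table is created

def bucketed_key(timestamp, guid, buckets=NUM_BUCKETS):
    """Prefix the original timestamp#guid key with a bucket derived from
    the guid, so consecutive timestamps spread over several key ranges."""
    bucket = zlib.crc32(guid.encode("utf-8")) % buckets
    width = len(str(buckets - 1))  # fixed-width prefix keeps sort order sane
    return f"{bucket:0{width}d}#{timestamp}#{guid}".encode("utf-8")

def scan_prefixes(buckets=NUM_BUCKETS):
    """A time-range read now needs one scan per bucket prefix."""
    width = len(str(buckets - 1))
    return [f"{b:0{width}d}#".encode("utf-8") for b in range(buckets)]

key = bucketed_key("2015-05-22 00:02:01", "AB12EC77778888945")
```

The trade-off is on the read side: a scan over a time range has to be issued
once per prefix returned by scan_prefixes() and the results merged.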

-Vlad

On Fri, May 22, 2015 at 2:43 AM, Michael Segel <[email protected]>
wrote:

> This is why I created HBASE-12853.
>
> So you don’t have to specify a custom split policy.
>
> Of course the simple solutions are often passed over because of NIH.  ;-)
>
> To be blunt… You encapsulate the bucketing code so that you have a single
> API into HBase regardless of the type of storage underneath.
> KISS is maintained and you stop people from attempting to do stupid
> things.   (cc’ing dev@hbase) As a product owner (read PMC / committers),
> you want to keep people from mucking about in the internals.  While it’s
> true that it’s open source, and you will have some who want to muck around,
> you also have to consider the corporate users who need something that is
> reliable and less customized so that it’s supportable.  This is the vendor’s
> dilemma. (hint: Cloudera, Horton, IBM, MapR)  You’re selling support for
> HBase and if a customer starts to overload internals with their own code,
> good luck in supporting it.  This is why you do things like 12853 because
> it makes your life easier.
>
> This isn’t a sexy solution. It’s core engineering work.
>
> HTH
>
> -Mike
>
> > On May 22, 2015, at 4:22 AM, Shushant Arora <[email protected]> wrote:
> >
> > Since the custom split policy is based on the second part (i.e. the guid),
> > which region will a key with first part 2015-05-22 00:01:02 be in, and how
> > will that be identified?
> >
> >
> > On Fri, May 22, 2015 at 1:12 PM, Ted Yu <[email protected]> wrote:
> >
> >> The custom split policy needs to respect the fact that timestamp is the
> >> leading part of the rowkey.
> >>
> >> This would avoid the overlap you mentioned.
> >>
> >> Cheers
> >>
> >>
> >>
> >>> On May 21, 2015, at 11:55 PM, Shushant Arora <[email protected]> wrote:
> >>>
> >>> The guid changes with every key; the pattern is
> >>> 2015-05-22 00:02:01#AB12EC77778888945
> >>> 2015-05-22 00:02:02#CD9870001234AB457
> >>>
> >>> When we specify a custom split algorithm, can it happen that keys in the
> >>> same sort-order range, say (1-7), lie in region R1 as well as in region R2?
> >>> Then how will the .META. table handle lookups at read time? Say I
> >>> search for key 3: will it search in both regions R1 and R2?
> >>>
> >>>> On Fri, May 22, 2015 at 10:48 AM, Ted Yu <[email protected]> wrote:
> >>>>
> >>>> Does guid change with every key ?
> >>>>
> >>>> bq. use second part of key
> >>>>
> >>>> I don't think so. Suppose first row in the parent region is
> >>>> '1432104178817#321'. After split, the first row in first daughter region
> >>>> would still be '1432104178817#321'. Right ?
> >>>>
> >>>> Cheers
> >>>>
> >>>> On Thu, May 21, 2015 at 9:57 PM, Shushant Arora <[email protected]> wrote:
> >>>>
> >>>>> Can I avoid region hotspotting with a custom region split policy in
> >>>>> HBase 0.96?
> >>>>>
> >>>>> Key is of the form timestamp#guid.
> >>>>> So can I have a custom region split policy that uses the second part of
> >>>>> the key (i.e. the guid) as the region split criterion and avoid the hot spot?
> >>>>
> >>
>
>
