>
> The config name was/is hbase.hregion.max.*filesize* and never
> *hbase.hregion.max.size*.
>

The description of hbase.hregion.max.filesize is very clear: it states that
the sum of all hfiles in the region should not exceed this property's
value. Also, we do not always use *hbase.hregion.max.filesize* to determine
the limit. A table-level MAX_FILESIZE descriptor can be used instead, and its
description in the TableDescriptorBuilder javadoc reads as below:

  /**
   * Returns the maximum size upto which a region can grow to after which a
   * region split is triggered. The region size is represented by the size of
   * the biggest store file in that region.
   *
   * @return max hregion size for table, -1 if not set.
   */

The current IncreasingToUpperBoundRegionSplitPolicy implementation violates
those configs.
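
To make the difference concrete, here is a minimal, simplified sketch of the
two checks being discussed. It is not the actual HBase code: the class and
method names are hypothetical, store sizes are plain numbers rather than real
store objects, and it ignores any smaller threshold the real policy may derive
for tables with few regions.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in HBase.
public class SplitCheckSketch {

    // Per-store check: split only if a single store (column family)
    // exceeds the limit on its own.
    public static boolean exceedsPerStore(List<Long> storeSizes, long maxFileSize) {
        for (long size : storeSizes) {
            if (size > maxFileSize) {
                return true;
            }
        }
        return false;
    }

    // Region-level check: split once the combined size of all stores
    // in the region exceeds the limit (the behaviour proposed as #3).
    public static boolean exceedsRegionTotal(List<Long> storeSizes, long maxFileSize) {
        long total = 0;
        for (long size : storeSizes) {
            total += size;
        }
        return total > maxFileSize;
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024L * 1024L;
        long maxFileSize = 10 * gb;                       // assumed limit
        List<Long> stores = Arrays.asList(8 * gb, 8 * gb, 8 * gb); // three CFs

        // No single store is above 10 GB, so the per-store check never fires,
        // even though the region as a whole holds 24 GB.
        System.out.println("per-store check:    " + exceedsPerStore(stores, maxFileSize));
        System.out.println("region-total check: " + exceedsRegionTotal(stores, maxFileSize));
    }
}
```

With three families of 8 GB each and a 10 GB limit, the per-store check
reports no split while the region-total check does, which is the situation
described for tables with multiple families.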

Do we have a consensus on applying #3 to all active branches? If so, I
would ask that HBASE-24530 proceed accordingly.



On Sun, Jun 21, 2020 at 19:09, Andrew Purtell <[email protected]> wrote:

> ‘Filesize’ and ‘size’ are ambiguous. They are open to interpretation and I
> don’t see one as more clear than the other, other than to imply something
> about file level measures being the determining factor. It doesn’t convey
> more semantics beyond that, ie one file trips the limit or the combined
> sizes of all files trips the limit. We can fix that with clarifying
> documentation. While doing so we also have an opportunity to fix something
> if our consensus is the current policy is not the usual user expectation.
>
> So how suboptimal is it? Does a compatibility concern make sense if we
> think this is just broken? Perhaps we can address all concerns by making
> the change in next minor releases and then do those minor releases soon.
>
>
> > On Jun 20, 2020, at 11:06 PM, Anoop John <[email protected]> wrote:
> >
> > I have a concern if we do #3 for all minor versions.  That will be a major
> > split behaviour change and can affect so much for tables with many CFs. If
> > one adjusted the pre splits so as to avoid further region splits, that calc
> > might go wrong once they migrate to new minor versions with this change
> > right?
> > The config name was/is hbase.hregion.max.*filesize* and never
> > *hbase.hregion.max.size*.  We will have HFiles at CF level and so a max
> > filesize is applicable at CF level.   So even this config name will create
> > confusion once we change the calc to consider size at region level (Sum of
> > sizes at CFs)
> >
> > Anoop
> >
> >
> >> On Fri, Jun 19, 2020 at 11:44 PM Viraj Jasani <[email protected]> wrote:
> >>
> >> Given that SteppingSplitPolicy is the default region split policy, removal
> >> of IncreasingToUpperBoundRegionSplitPolicy is going to make things more
> >> complex for master branch if we follow #2.
> >> Hence, I believe we should better go with #3 for all.
> >>
> >>
> >>> On 2020/06/19 17:52:27, Viraj Jasani <[email protected]> wrote:
> >>> Can we do a mix of #2 and #3 i.e remove
> >>> IncreasingToUpperBoundRegionSplitPolicy from master, and follow #3 for
> >>> branch-2 and all active release branches? If it breaks any compatibility
> >>> rules, then we can go with #3 for all.
> >>>
> >>>
> >>> On 2020/06/19 17:33:14, Andrew Purtell <[email protected]> wrote:
> >>>> I vote for #3, and it should be applied to all active code lines.
> >>>>
> >>>>
> >>>> On Fri, Jun 19, 2020 at 3:35 AM Wellington Chevreuil <
> >>>> [email protected]> wrote:
> >>>>
> >>>>> While going through the changes proposed on HBASE-24530, we
> >>>>> observed IncreasingToUpperBoundRegionSplitPolicy
> >>>>> compares hbase.hregion.max.filesize against individual stores within a
> >>>>> region when deciding whether to split a region or not. For tables having
> >>>>> multiple families, this can lead to regions much larger than what's
> >>>>> defined by hbase.hregion.max.filesize.
> >>>>>
> >>>>> Current proposal on HBASE-24530 is to add an extra policy that actually
> >>>>> compares the overall region size (combining all region stores sizes)
> >>>>> against hbase.hregion.max.filesize, but I wonder if it really makes sense
> >>>>> to keep a policy with current IncreasingToUpperBoundRegionSplitPolicy
> >>>>> behaviour. Would like to hear folks opinions if we should take any of the
> >>>>> below actions?
> >>>>> 1) Leave IncreasingToUpperBoundRegionSplitPolicy as it is and just add the
> >>>>> new policy proposed on HBASE-24530;
> >>>>> 2) Make IncreasingToUpperBoundRegionSplitPolicy deprecated and remove it
> >>>>> from master branch;
> >>>>> 3) Change IncreasingToUpperBoundRegionSplitPolicy to actually implement the
> >>>>> logic of the new policy proposed on HBASE-24530;
> >>>>>
> >>>>> My view is that the current IncreasingToUpperBoundRegionSplitPolicy
> >>>>> behaviour is a bug, and I vote for #3.
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Best regards,
> >>>> Andrew
> >>>>
> >>>> Words like orphans lost among the crosstalk, meaning torn from truth's
> >>>> decrepit hands
> >>>>   - A23, Crosstalk
> >>>>
> >>>
> >>
>
