Based on the config you pasted:
"<property>
<name>hbase.hregion.max.filesize</name>
<value>10737418240</value>
<description>
Maximum HStoreFile size. If any one of a column families' HStoreFiles has
grown to exceed this value, the hosting HRegion is split in two.
</description>
</property>"
I can say the issue is the version of HBase.
Older HBase versions behaved the way you describe: when a single file under
a region's CF grew above the max limit, the region would split. The check
was written that way because we try to major compact the files under a CF
into one large file anyway, so checking the largest file was OK.
This was changed later, and we now check the sum of all files under a
region:cf. I am not sure which version introduced this. The change became
necessary once we supported features like Date Tiered Compaction and Stripe
Compaction, which do not compact everything into one file.
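
To make the difference between the two checks concrete, here is a rough
sketch in Python. It is only a simplified model of what I described above,
not actual HBase code; the function names are made up for illustration, and
the only value taken from your setup is the 10737418240 limit.

```python
# Simplified model of the two split checks; NOT HBase source code.
# should_split_old / should_split_new are hypothetical names.

GIB = 1024 ** 3
MAX_FILESIZE = 10737418240  # hbase.hregion.max.filesize from the pasted config


def should_split_old(store_file_sizes, max_filesize=MAX_FILESIZE):
    """Older check: split once any single file under the CF exceeds the limit."""
    return any(size > max_filesize for size in store_file_sizes)


def should_split_new(store_file_sizes, max_filesize=MAX_FILESIZE):
    """Newer check: split once the sum of all files under the CF exceeds the limit."""
    return sum(store_file_sizes) > max_filesize


# One CF holding three 4 GiB files: no single file is over the limit,
# but the total (12 GiB) is.
files = [4 * GIB, 4 * GIB, 4 * GIB]
print(should_split_old(files))  # False
print(should_split_new(files))  # True
```

With the older check, a CF like this can keep growing past the configured
limit until a compaction produces one file large enough to trigger the
split, which would explain regions ending up with different actual sizes.
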
So to get the behaviour you want, try upgrading to a newer version.
Anoop
On Thu, Jun 20, 2019 at 9:55 PM Jean-Marc Spaggiari <[email protected]>
wrote:
> Hi,
>
> Just updating what I said (thanks Anoop for the warning). I had assumed
> that you have a single CF. The maxfilesize is per CF, not per
> region. If you have a single CF, then it becomes the same as per region, but
> a region will split whenever one of the CFs reaches the limit.
>
> HBase will not split a single row. So if you have a single row that grows
> bigger than the maxfilesize, the region will keep growing. You need to
> assess this risk when you do your table design and avoid it. It will not
> split even if there are millions of column qualifiers. A region is defined
> by a start row and a stop row. Therefore a single row can belong only to a
> single region.
>
> JMS
>
> On Thu, Jun 20, 2019 at 05:00, Roshan <[email protected]> wrote:
>
> > Hi,
> >
> > If a single rowkey in the table exceeds the configured
> > hbase.hregion.max.filesize, will the region split or not? In this
> > case, what performance issues will we face in the cluster?
> >
> > If the rowkey (belonging to a single column family) also has different
> > column qualifiers, will the HFile still not split?
> >
> >
> >
> > On Thu, 20 Jun 2019 at 11:38, [email protected] <
> > [email protected]> wrote:
> >
> > > this conf:
> > > <property>
> > > <name>hbase.hregion.max.filesize</name>
> > > <value>10737418240</value>
> > > <description>
> > > Maximum HStoreFile size. If any one of a column families' HStoreFiles
> > > has grown to exceed this value, the hosting HRegion is split in
> > > two.</description>
> > > </property>
> > >
> > >
> > >
> > >
> > >
> > > [email protected]
> > >
> > > From: Jean-Marc Spaggiari
> > > Date: 2019-06-19 06:52
> > > To: user
> > > Subject: Re: question on hfile size upper limit
> > > Hi,
> > >
> > > Can you please confirm which parameter you are talking about? The default
> > > HBase setting is to limit the size per region (10GB by default), and not by
> > > HFiles. This can be configured at the HBase level, or at the table level.
> > >
> > > HTH,
> > >
> > > JMS
> > >
> > > On Tue, Jun 18, 2019 at 11:32, [email protected] <
> > > [email protected]> wrote:
> > >
> > > > We set a size upper limit for HFiles, but not for regions,
> > > > so regions have different actual sizes, which leads to some analysis
> > > > tasks having different input sizes.
> > > >
> > > > Can we set a size limit on regions?
> > > >
> > > >
> > > >
> > > >
> > > > [email protected]
> > > >
> > >
> >
>