Kevin,

Thank you for your response. This is not a question about how to correctly configure an HBase cluster for write-heavy workloads. This is an internal HBase issue: something is wrong in the default logic of the compaction selection algorithm in 0.94-0.98. It seems that nobody has ever tested importing data with a very high hbase.hstore.blockingStoreFiles value (200 in our case).
-Vladimir Rodionov

On Wed, Dec 3, 2014 at 6:38 AM, Kevin O'dell <[email protected]> wrote:
> Vladimir,
>
> I know you said, "do not ask me why", but I am going to have to ask you
> why. The fact that you are doing this (this being blocking store files > 200)
> tells me there is something, or multiple somethings, wrong with your
> cluster setup. A couple of things come to mind:
>
> * During this heavy write period, could we use bulk loads? If so, this
> should solve almost all of your problems.
>
> * A 1GB region size is WAY too small. If you are pushing the volume of
> data you are talking about, I would recommend 10-20GB region sizes; this
> should also help keep your region count smaller, which will result in
> more optimal writes.
>
> * Your cluster may be undersized. If you are setting the blocking that
> high, you may be pushing too much data for your cluster overall.
>
> Would you be so kind as to pass me a few pieces of information?
>
> 1.) Cluster size
> 2.) Average region count per RS
> 3.) Heap size, memstore global settings, and block cache settings
> 4.) An RS log on pastebin and a time frame of "high writes"
>
> I can probably make some solid suggestions for you based on the above
> data.
>
> On Wed, Dec 3, 2014 at 1:04 AM, Vladimir Rodionov <[email protected]>
> wrote:
>
> > This is what we observed in our environment(s).
> >
> > The issue exists in CDH4.5, 5.1, HDP2.1, and MapR4.
> >
> > If someone sets the number of blocking store files way above the
> > default value, say to 200, to avoid write stalls during intensive data
> > loading (do not ask me why we do this), then one of the regions grows
> > indefinitely and takes more than 99% of the overall table.
> >
> > It can't be split because it still has orphaned reference files. Some
> > of the reference files are able to avoid compactions for a long time,
> > obviously.
> >
> > The split policy is IncreasingToUpperBound and the max region size is
> > 1G. I do my tests on CDH4.5 mostly, but all other distros seem to have
> > the same issue.
> >
> > My attempt to forcefully add reference files to the compaction list in
> > Store.requestCompaction() when a region exceeds the recommended maximum
> > size did not work out well - some weird results in our test cases (but
> > the HBase tests are OK: small, medium, and large).
> >
> > What is so special about these reference files? Any ideas on what can
> > be done here to fix the issue?
> >
> > -Vladimir Rodionov
>
> --
> Kevin O'Dell
> Systems Engineer, Cloudera
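[To make the "files avoid compactions" behavior concrete: the sketch below is a toy model of the ratio-based minor-compaction selection used around 0.94 (RatioBasedCompactionPolicy), not the real implementation - real HBase also applies min/max file counts, off-peak ratios, and later the ExploringCompactionPolicy. It only illustrates the mechanism by which a large old file can keep being skipped while a stream of small flushes is compacted around it.]

```python
# Toy model of HBase's ratio-based minor-compaction selection.
# Hypothetical simplification for illustration only.

RATIO = 1.2      # hbase.hstore.compaction.ratio (default)
MAX_FILES = 10   # hbase.hstore.compaction.max (default)

def select_for_compaction(sizes):
    """sizes: store-file sizes (MB), ordered oldest -> newest.

    Walking from the oldest file, a file is excluded while it is
    larger than RATIO * (total size of all newer files); the
    selection is then capped at MAX_FILES."""
    start = 0
    while start < len(sizes) and sizes[start] > RATIO * sum(sizes[start + 1:]):
        start += 1
    return sizes[start:start + MAX_FILES]

# One large old file plus a stream of small flushes: the large file is
# never re-selected, so a similarly sized reference file can dodge
# minor compactions for a long time.
files = [900] + [10] * 20
print(select_for_compaction(files))  # ten 10 MB files; 900 is skipped
```

With a very high hbase.hstore.blockingStoreFiles, minor compactions can keep cycling through small new files indefinitely, which is consistent with the reference files never being rewritten and the region staying unsplittable.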

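[For readers following the thread: Kevin's suggestions map onto standard hbase-site.xml properties. The snippet below is a sketch with illustrative values, not a recommendation for every cluster; tune to your own workload.]

```xml
<!-- Sketch only: illustrative values for the settings discussed above. -->
<property>
  <!-- Larger regions (10 GB) instead of the 1 GB used in the tests -->
  <name>hbase.hregion.max.filesize</name>
  <value>10737418240</value>
</property>
<property>
  <!-- Keep the blocking store-file count near its default
       rather than raising it to 200 -->
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>10</value>
</property>
```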