[ https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-2731:
--------------------------

    Attachment: split-v8.patch

Problem: We have no governor on flushes, so it's possible -- especially after 
all the performance improvements since 0.15.x -- to flush at a rate that 
overwhelms the rate at which we can compact store files.

The attached patch does not solve the overrun problem.  It does its best at 
ameliorating it by making splits happen more promptly, and then, post split, 
by splitting a quiescent region whenever a store file exceeds 256M -- even if 
it is the only one.
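To make the new trigger concrete, here is a minimal sketch in Java of that 
check.  The names (Region, StoreFile, SplitCheck) are hypothetical stand-ins, 
not the actual HBase classes:

import java.util.List;

// Hypothetical stand-ins for the real region/store file classes.
interface StoreFile {
  long length();
}

interface Region {
  boolean isQuiescent();
  List<StoreFile> storeFiles();
}

class SplitCheck {
  static final long DESIRED_MAX_FILE_SIZE = 256L * 1024 * 1024; // 256M

  // A quiescent region splits as soon as any single store file exceeds
  // the desired maximum, even if it is the only store file.
  static boolean needsSplit(Region r) {
    if (!r.isQuiescent()) {
      return false;
    }
    for (StoreFile f : r.storeFiles()) {
      if (f.length() > DESIRED_MAX_FILE_SIZE) {
        return true;
      }
    }
    return false;
  }
}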

More detail on patch:

+ HADOOP-2712 set a flag such that if a split was queued, we'd stop further 
compactions.  Turns out that more often than not, it was possible for one last 
compaction to start before the split ran.  This patch puts the compacting and 
splitting threads back together; a check for whether a split is needed will 
always follow a compaction check -- being event-driven, this was not 
guaranteed when the threads were distinct (see the sketch after this list).  
Also made it possible to split even though no compaction was run (removed the 
2172 addition; it was too subtle).
+ Flushes could also get in the way of a split, so now flushes are blocked 
too when a split is queued.
+ On open, check whether the region needs to be compacted (previously this 
check was only done after the first flush, which could be 20-30s out).
+ Made it so we split if > 256M, not if > 1.5 * 256M.  Set the multiplier on 
flushes to 1 instead of 2 so we flush at 64M, not 64M plus some slop.  This 
regularizes splits and flushes.
+ Made it so we'll split if even a single file is > 256M, and so we'll 
compact if there is only one file but it has references to the parent region.
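Below is a rough sketch of the recombined thread described in the first item 
above.  Names are hypothetical, and the real patch is event-driven rather 
than polling:

// Sketch only: a single worker where the split check always follows the
// compaction check, and flushes are blocked while a split is pending.
class CompactSplitThread extends Thread {
  interface Region {
    boolean needsCompaction();
    void compact();
    boolean needsSplit(); // e.g. any store file > 256M, per the sketch above
    void split();
  }

  private final Region region;
  private volatile boolean splitPending;

  CompactSplitThread(Region region) {
    this.region = region;
  }

  // A flusher would consult this before flushing.
  boolean flushesBlocked() {
    return splitPending;
  }

  @Override
  public void run() {
    while (!isInterrupted()) {
      if (region.needsCompaction()) {
        region.compact();
      }
      // The split check runs even when no compaction did -- with separate
      // threads this ordering was not guaranteed, which let one last
      // compaction start before the split ran.
      if (region.needsSplit()) {
        splitPending = true;  // block flushes until the split completes
        region.split();
        splitPending = false;
      }
      try {
        Thread.sleep(1000);   // polling stand-in for the event-driven wakeup
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
    }
  }
}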

I tried Billy's suggestion of putting a cap on the number of mapfiles to 
compact in one go.  We need to do more work though before we can make use of 
this suggested technique, because regions that hold references are not 
splittable: I was compacting the N oldest files, then on the second 
compaction would do the N oldest again, but the remainder could have 
references to parent regions and so couldn't be split.  Meantime we'd 
accumulate flush files -- the region would never split and the count of flush 
files would overwhelm the compactor.  We need to be smarter and, as Billy 
suggests, pick up the small files (a sketch of such a selection follows).
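For what that selection might look like, here is a hedged sketch (hypothetical 
names again, not patch code): always fold in files that still reference the 
parent region, then fill the rest of the cap with the smallest files instead 
of the N oldest:

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class CompactionSelector {
  interface StoreFile {
    long length();
    boolean hasReferencesToParent();
  }

  static List<StoreFile> select(List<StoreFile> files, int cap) {
    List<StoreFile> chosen = new ArrayList<>();
    // Files referencing the parent region make the region unsplittable,
    // so they are always included, even if that exceeds the cap.
    for (StoreFile f : files) {
      if (f.hasReferencesToParent()) {
        chosen.add(f);
      }
    }
    // Fill remaining slots with the smallest files ("pick up the small
    // files") so flush files don't accumulate while big files hog the cap.
    files.stream()
        .filter(f -> !f.hasReferencesToParent())
        .sorted(Comparator.comparingLong(StoreFile::length))
        .limit(Math.max(0, cap - chosen.size()))
        .forEach(chosen::add);
    return chosen;
  }
}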

> [hbase] Under extreme load, regions become extremely large and eventually 
> cause region servers to become unresponsive
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2731
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2731
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/hbase
>            Reporter: Bryan Duxbury
>         Attachments: split-v8.patch, split.patch
>
>
> When attempting to write to HBase as fast as possible, HBase accepts puts at 
> a reasonably high rate for a while, and then the rate begins to drop off, 
> ultimately culminating in exceptions reaching client code. In my testing, I 
> was able to write about 370 10KB records a second to HBase until I reached 
> around 1 million rows written. At that point, a moderate to large number of 
> exceptions - NotServingRegionException, WrongRegionException, region offline, 
> etc - begin reaching the client code. This appears to be because the 
> retry-and-wait logic in HTable runs out of retries and fails. 
> Looking at mapfiles for the regions from the command line shows that some of 
> the mapfiles are between 1 and 2 GB in size, much more than the stated file 
> size limit. Talking with Stack, one possible explanation for this is that the 
> RegionServer is not choosing to compact files often enough, leading to many 
> small mapfiles, which in turn leads to a few overlarge mapfiles. Then, when 
> the time comes to do a split or "major" compaction, it takes an unexpectedly 
> long time to complete these operations. This translates into errors for the 
> client application.
> If I back off the import process and give the cluster some quiet time, some 
> splits and compactions clearly do take place, because the number of regions 
> goes up and the number of mapfiles/region goes down. I can then begin writing 
> again in earnest for a short period of time until the problem begins again.
> Both Marc Harris and I have seen this behavior.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
