[
https://issues.apache.org/jira/browse/HADOOP-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12563869#action_12563869
]
stack commented on HADOOP-2731:
-------------------------------
HADOOP-2636 doesn't seem to help with this particular problem.
Running 8 clients doing PerformanceEvaluation, what I'm looking for is a steady
number of files to compact on each run.
Here are the first four compactions before the patch:
{code}
2008-01-30 07:03:06,803 DEBUG org.apache.hadoop.hbase.HStore: started
compaction of 3 files using hdfs://12.:9123/hbase123/TestTable/compaction.dir
for 530893190/info
2008-01-30 07:04:25,345 DEBUG org.apache.hadoop.hbase.HStore: started
compaction of 3 files using hdfs://12.:9123/hbase123/TestTable/compaction.dir
for 530893190/info
2008-01-30 07:06:35,573 DEBUG org.apache.hadoop.hbase.HStore: started
compaction of 3 files using hdfs://12.:9123/hbase123/TestTable/compaction.dir
for 530893190/info
2008-01-30 07:11:14,999 DEBUG org.apache.hadoop.hbase.HStore: started
compaction of 9 files using hdfs://12.:9123/hbase123/TestTable/compaction.dir
for 560724365/info
{code}
A split ran between the 3rd and 4th compactions.
Here are the first four compactions after applying the patch:
{code}
2008-01-30 06:43:17,972 DEBUG org.apache.hadoop.hbase.HStore: started
compaction of 4 files using hdfs://12.:9123/hbase123/TestTable/compaction.dir
for 1984834473/info
2008-01-30 06:44:54,734 DEBUG org.apache.hadoop.hbase.HStore: started
compaction of 3 files using hdfs://12.:9123/hbase123/TestTable/compaction.dir
for 1984834473/info
2008-01-30 06:48:53,389 DEBUG org.apache.hadoop.hbase.HStore: started
compaction of 7 files using hdfs://12.:9123/hbase123/TestTable/compaction.dir
for 712183868/info
2008-01-30 06:53:25,746 DEBUG org.apache.hadoop.hbase.HStore: started
compaction of 9 files using hdfs://12.:9123/hbase123/TestTable/compaction.dir
for 712183868/info
{code}
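For eyeballing this over a longer run, here is a throwaway sketch (not part of the patch; CompactionTally is just an illustrative name) that pulls the per-compaction file counts out of a region server log, assuming each DEBUG entry of the form shown above sits on a single line in the raw log (they are only wrapped here by the mail formatting):
{code}
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CompactionTally {
  // Matches the "started compaction of N files ... for STORE" DEBUG lines above.
  private static final Pattern COMPACTION =
      Pattern.compile("started compaction of (\\d+) files using \\S+ for (\\S+)");

  public static void main(String[] args) throws Exception {
    BufferedReader in = new BufferedReader(new FileReader(args[0]));
    String line;
    while ((line = in.readLine()) != null) {
      Matcher m = COMPACTION.matcher(line);
      if (m.find()) {
        // Print the store (e.g. 530893190/info) and the number of files compacted.
        System.out.println(m.group(2) + "\t" + m.group(1));
      }
    }
    in.close();
  }
}
{code}
Run against a region server log it prints one "store, file count" pair per compaction, which makes it easy to see whether the counts stay steady or climb toward the 7s and 9s seen above.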
> [hbase] Under extreme load, regions become extremely large and eventually
> cause region servers to become unresponsive
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-2731
> URL: https://issues.apache.org/jira/browse/HADOOP-2731
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/hbase
> Reporter: Bryan Duxbury
> Attachments: split.patch
>
>
> When attempting to write to HBase as fast as possible, HBase accepts puts at
> a reasonably high rate for a while, and then the rate begins to drop off,
> ultimately culminating in exceptions reaching client code. In my testing, I
> was able to write about 370 10KB records a second to HBase until I reached
> around 1 million rows written. At that point, a moderate to large number of
> exceptions - NotServingRegionException, WrongRegionException, region offline,
> etc - begin reaching the client code. This appears to be because the
> retry-and-wait logic in HTable runs out of retries and fails.
> Looking at mapfiles for the regions from the command line shows that some of
> the mapfiles are between 1 and 2 GB in size, much more than the stated file
> size limit. Talking with Stack, one possible explanation for this is that the
> RegionServer is not choosing to compact files often enough, leading to many
> small mapfiles, which in turn leads to a few overlarge mapfiles. Then, when
> the time comes to do a split or "major" compaction, it takes an unexpectedly
> long time to complete these operations. This translates into errors for the
> client application.
> If I back off the import process and give the cluster some quiet time, some
> splits and compactions clearly do take place, because the number of regions
> goes up and the number of mapfiles per region goes down. I can then begin writing
> again in earnest for a short period of time until the problem begins again.
> Both Marc Harris and I have seen this behavior.
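Rough arithmetic on the load reported in the description, just to put the numbers in context (figures taken from the report above; LoadEstimate is only an illustrative snippet, not anything in HBase):
{code}
public class LoadEstimate {
  public static void main(String[] args) {
    int recordsPerSecond = 370;   // reported put rate
    int recordSizeKB = 10;        // reported record size
    long rowsWritten = 1000000L;  // roughly where the exceptions start

    double mbPerSecond = recordsPerSecond * recordSizeKB / 1024.0;
    double totalGB = rowsWritten * recordSizeKB / (1024.0 * 1024.0);

    System.out.printf("~%.1f MB/s of puts, ~%.1f GB written before errors%n",
        mbPerSecond, totalGB);
  }
}
{code}
That works out to roughly 3.6 MB/s of incoming puts and close to 10 GB written before the errors start, which gives a sense of how quickly the flush/compact/split cycle has to keep up under this kind of load.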
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.