[hbase] Under extreme load, regions become extremely large and eventually cause
region servers to become unresponsive
---------------------------------------------------------------------------------------------------------------------
Key: HADOOP-2731
URL: https://issues.apache.org/jira/browse/HADOOP-2731
Project: Hadoop Core
Issue Type: Bug
Components: contrib/hbase
Reporter: Bryan Duxbury
When attempting to write to HBase as fast as possible, HBase accepts puts at a
reasonably high rate for a while, but then the rate begins to drop off,
ultimately culminating in exceptions reaching client code. In my testing, I was
able to write about 370 10KB records per second to HBase until I reached around
1 million rows written. At that point, a moderate to large number of exceptions
- NotServingRegionException, WrongRegionException, region offline, etc. - begin
reaching the client code. This appears to be because the retry-and-wait logic
in HTable runs out of retries and fails.
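For context, the import is essentially a tight put loop. The sketch below is
not the actual test code, just an illustration assuming the current HTable
startUpdate/put/commit API; the table name, column family, and row key format
are placeholders:
{code}
import java.util.Random;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTable;
import org.apache.hadoop.io.Text;

public class ImportLoadTest {
  public static void main(String[] args) throws Exception {
    HBaseConfiguration conf = new HBaseConfiguration();
    // Placeholder table with a single column family "content:"; assumed to
    // have been created beforehand.
    HTable table = new HTable(conf, new Text("testimport"));

    byte[] value = new byte[10 * 1024];          // 10KB record, as in the test
    new Random().nextBytes(value);

    long start = System.currentTimeMillis();
    for (long row = 0; row < 1000000; row++) {   // roughly where errors begin
      long lockid = table.startUpdate(new Text(String.format("row%010d", row)));
      table.put(lockid, new Text("content:data"), value);
      table.commit(lockid);

      if (row > 0 && row % 10000 == 0) {
        long elapsed = System.currentTimeMillis() - start;
        System.out.println(row + " rows written, about "
            + (row * 1000L / elapsed) + " rows/sec");
      }
    }
  }
}
{code}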
Looking at the mapfiles for the regions from the command line shows that some
of them are between 1 and 2 GB in size, far over the configured file size
limit. Talking with Stack, one possible explanation is that the RegionServer is
not choosing to compact often enough, so many small mapfiles accumulate in each
region, and when they finally are compacted the result is a few overlarge
mapfiles. Then, when the time comes to do a split or "major" compaction, those
operations take an unexpectedly long time to complete, which translates into
errors for the client application.
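The settings that matter here, as I understand them, are the region file size
limit and the store compaction threshold. The snippet below just reads them
back from the configuration; the property names and defaults are from memory,
so treat them as assumptions to be checked against hbase-default.xml:
{code}
import org.apache.hadoop.hbase.HBaseConfiguration;

public class RegionSizeSettings {
  public static void main(String[] args) {
    HBaseConfiguration conf = new HBaseConfiguration();

    // Size limit past which a region is supposed to be split; the 1-2GB
    // mapfiles described above are well past this.
    long maxFileSize =
        conf.getLong("hbase.hregion.max.filesize", 256 * 1024 * 1024);

    // Number of mapfiles in a store before a compaction is triggered.
    int compactionThreshold =
        conf.getInt("hbase.hstore.compactionThreshold", 3);

    System.out.println("region max filesize:  " + maxFileSize);
    System.out.println("compaction threshold: " + compactionThreshold);
  }
}
{code}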
If I back off the import process and give the cluster some quiet time, some
splits and compactions clearly do take place, because the number of regions
goes up and the number of mapfiles per region goes down. I can then resume
writing in earnest, but only for a short period before the problem begins
again.
Both Marc Harris and I have seen this behavior.
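The manual back-off can be approximated in the client itself: stop writing when
the region errors start and give the cluster quiet time before resuming. This
is only a sketch of that workaround, using the same placeholder table and API
assumptions as above:
{code}
import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTable;
import org.apache.hadoop.io.Text;

public class ThrottledImport {
  public static void main(String[] args) throws Exception {
    HTable table = new HTable(new HBaseConfiguration(), new Text("testimport"));
    byte[] value = new byte[10 * 1024];

    for (long row = 0; row < 1000000; row++) {
      try {
        long lockid =
            table.startUpdate(new Text(String.format("row%010d", row)));
        table.put(lockid, new Text("content:data"), value);
        table.commit(lockid);
      } catch (IOException e) {
        // NotServingRegionException and friends surface as IOExceptions once
        // HTable's own retries are exhausted. Back off, let the cluster split
        // and compact, then retry the same row.
        System.err.println("write failed at row " + row + ": " + e.getMessage());
        Thread.sleep(60 * 1000);
        row--;
      }
    }
  }
}
{code}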