[ 
https://issues.apache.org/jira/browse/HADOOP-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Kellerman updated HADOOP-1644:
----------------------------------

        Fix Version/s: 0.15.0
             Priority: Major  (was: Minor)
           Issue Type: Improvement  (was: Wish)
    Affects Version/s: 0.15.0
              Summary: [hbase] Compactions should not block updates  (was: 
[hbase] Compactions should take no longer than period between memcache flushes)

> [hbase] Compactions should not block updates
> --------------------------------------------
>
>                 Key: HADOOP-1644
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1644
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/hbase
>    Affects Versions: 0.15.0
>            Reporter: stack
>            Assignee: stack
>             Fix For: 0.15.0
>
>
> Currently, compactions take a long time.  During compaction, updates are 
> carried by the HRegions' memcache (+ backing HLog).  memcache is unable to 
> flush to disk until compaction completes.
> Under sustained, substantial updates by multiple concurrent clients (10 in 
> this case), where each row contains multiple columns, one of which is a web 
> page, the memcache grows fast, often to orders of magnitude beyond the 
> configured 'flush-to-disk' threshold.  This is a common hbase usage scenario.
> This throws the whole system out of kilter.  When the memcache flush does 
> get to run after compaction completes (assuming you have sufficient RAM and 
> the region server doesn't OOME), the resulting on-disk file is far larger 
> than any other on-disk HStoreFile, bringing on a region split.  But the 
> resulting split produces regions that themselves need to be split 
> immediately because each half is beyond the configured limit, and so on.
> In another issue yet to be posted, tuning and some pointed memcache flushes 
> make the above condition less extreme, but until compaction durations come 
> close to the period between memcache flushes, compactions will remain 
> disruptive.  It's allowed that compactions may never be fast enough, as per 
> the bigtable paper (this is a 'wish' issue).
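
The cascade described above can be roughed out with a little arithmetic. The
sketch below is only an illustration, not hbase code: the function names, the
write rate, the compaction duration, and the region size limit are all
hypothetical, and it assumes that no flush runs while a compaction is in
progress and that every split divides a region into two equal halves.

```python
def memcache_size_after_compaction(write_rate_mb_per_s, compaction_seconds,
                                   initial_mb=0.0):
    """Memcache size (MB) when no flush can run during a compaction.

    Assumes a constant incoming write rate (hypothetical simplification).
    """
    return initial_mb + write_rate_mb_per_s * compaction_seconds


def split_rounds(file_mb, max_region_mb):
    """Rounds of halving needed until each piece is within max_region_mb.

    Assumes each split yields two equal halves (hypothetical simplification).
    """
    rounds = 0
    size = file_mb
    while size > max_region_mb:
        size /= 2
        rounds += 1
    return rounds


if __name__ == "__main__":
    # Hypothetical numbers: 10 MB/s of updates, a 10-minute compaction,
    # and a 256 MB region size limit.
    flushed_mb = memcache_size_after_compaction(10, 600)
    rounds = split_rounds(flushed_mb, 256)
    print(flushed_mb, rounds, 2 ** rounds)
```

With these made-up inputs, a single oversized flush needs several successive
rounds of splits before every region is back under the limit, which is the
"and so on" in the description.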

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
