[jira] Commented: (HBASE-1062) Compactions at (re)start on a large table can overwhelm DFS

stack (JIRA) Wed, 17 Dec 2008 15:21:08 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657610#action_12657610
 ]


stack commented on HBASE-1062:
------------------------------

A few comments on the patch, Andrew:

+ Is it wise to postpone memcache flushes, even if it's only for the 2 minutes 
of HRS safe mode?  Can we take on updates during this time?  If so, could we 
OOME if rabid uploading is afoot?
+ We schedule compactions on open and on flush.  This would put off the open 
scheduling for an interval of 2 minutes.  If the cluster went down ugly and some 
regions had References outstanding, then these regions would not be splittable 
until a memcache flush ran; i.e. until they had taken on a bunch of uploads.  
Maybe that's OK?
+ Do we ever break out of this loop:

{code}
+        if ((limit > 0) && (++count > limit)) {
+          try {
+            Thread.sleep(this.frequency);
+          } catch (InterruptedException ex) {
+            continue;
+          }
+          count = 0;
+        }
{code}

Looks like we increment count, then set it to zero after the sleep.  Does it 
ever progress?
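
To make the question concrete, here is a minimal standalone sketch of the same counter-and-sleep pattern, not the patch itself. The class name ThrottledLoop, the method process, and the parameters items and pauseMillis are hypothetical; only the (limit > 0) && (++count > limit) test and the reset to zero come from the snippet above. It assumes the check sits inside a loop that does one unit of work per iteration, and it ends the loop on interrupt rather than using continue, which is a deliberate difference from the patch.

```java
class ThrottledLoop {
    /**
     * Runs `items` iterations, sleeping for `pauseMillis` each time the
     * counter exceeds `limit`. Returns the number of pauses taken.
     * All names here are illustrative, not taken from the patch.
     */
    static int process(int items, int limit, long pauseMillis) {
        int count = 0;
        int pauses = 0;
        for (int i = 0; i < items; i++) {
            // One unit of work would go here (e.g. one compaction request).
            if ((limit > 0) && (++count > limit)) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException ex) {
                    // Restore the interrupt flag and stop, instead of
                    // skipping the reset with `continue` as the patch does.
                    Thread.currentThread().interrupt();
                    break;
                }
                count = 0; // a new window of `limit` iterations starts here
                pauses++;
            }
        }
        return pauses;
    }

    public static void main(String[] args) {
        System.out.println("pauses: " + process(10, 3, 5L));
    }
}
```

In this isolated form the loop does advance: the counter climbs past the limit, the thread sleeps once, and the counter starts over. Whether the same holds in the patch depends on what the enclosing loop does, and on the interrupt path, where continue skips the reset.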

> Compactions at (re)start on a large table can overwhelm DFS
> -----------------------------------------------------------
>
>                 Key: HBASE-1062
>                 URL: https://issues.apache.org/jira/browse/HBASE-1062
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Critical
>             Fix For: 0.20.0
>
>         Attachments: 1062-1.patch
>
>
> Given a large table, > 1000 regions for example, if a cluster restart is 
> necessary, the compactions undertaken by the regionservers when the master 
> makes initial region assignments can overwhelm DFS, leading to file errors 
> and data loss. This condition is exacerbated if write load was heavy before 
> restart and so many regions want to split as soon as they are opened. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
