[ 
https://issues.apache.org/jira/browse/HBASE-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-1062:
----------------------------------

      Component/s: regionserver
         Priority: Critical  (was: Major)
    Fix Version/s: 0.20.0
         Assignee: Andrew Purtell

Made this a critical issue for 0.20.0 and took ownership. I think I am done 
with this issue for now :-), at least until the patch. 

Of course the proposed solution will lengthen cluster (re)start time, perhaps 
substantially, but I think that is better than data loss. 

A better solution might be to interrogate DFS for its current load before 
attempting any action that might load it, but neither the Hadoop filesystem 
abstraction layer nor the DFS protocols support this. What do you think about 
filing an issue about this upstream?
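
To make the idea concrete, here is a rough sketch of what such a load check
could look like on the caller's side. Everything in it is hypothetical: the
DoubleSupplier probe stands in for a DFS load query that does not exist today,
and LoadGate, tryRun, and the 0.0 to 1.0 load scale are made-up names and
assumptions, not HBase or Hadoop APIs.

```java
import java.util.function.DoubleSupplier;

/**
 * Hypothetical sketch only: defers DFS-heavy work (for example a compaction)
 * while a load probe reports the filesystem as busy.
 */
public class LoadGate {
    // Assumption: the probe returns current DFS load normalized to [0.0, 1.0].
    private final DoubleSupplier loadProbe;
    // Load at or above this value means "too busy, do not add more work".
    private final double threshold;

    public LoadGate(DoubleSupplier loadProbe, double threshold) {
        this.loadProbe = loadProbe;
        this.threshold = threshold;
    }

    /** Runs the action only if load is below the threshold; returns whether it ran. */
    public boolean tryRun(Runnable action) {
        if (loadProbe.getAsDouble() >= threshold) {
            return false; // caller should retry later, e.g. after a backoff
        }
        action.run();
        return true;
    }
}
```

A regionserver could then call tryRun(...) around each compaction request at
startup and requeue the request whenever it returns false, which would trade
longer restart time for less pressure on DFS.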

> Compactions at (re)start on a large table can overwhelm DFS
> -----------------------------------------------------------
>
>                 Key: HBASE-1062
>                 URL: https://issues.apache.org/jira/browse/HBASE-1062
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Critical
>             Fix For: 0.20.0
>
>
> Given a large table (more than 1000 regions, for example), if a cluster 
> restart is necessary, the compactions undertaken by the regionservers when 
> the master makes initial region assignments can overwhelm DFS, leading to 
> file errors and data loss. This condition is exacerbated if write load was 
> heavy before the restart, so that many regions want to split as soon as they 
> are opened. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
