[
https://issues.apache.org/jira/browse/HBASE-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656540#action_12656540
]
Andrew Purtell commented on HBASE-1062:
---------------------------------------
Our 25 node cluster layout is:
1: namenode, datanode
2: datanode, hmaster, jobtracker
3-25: datanode, regionserver, tasktracker
We run datanodes everywhere because each node has 2.5TB of storage that we'd
clearly like to include in the DFS volume.
Tasktrackers do not run on the semi-dedicated namenode node nor the
semi-dedicated hmaster node. There is a HRS running alongside every TT. Each TT
is configured to allow only four concurrent tasks -- 2 mappers and/or 2
reducers. Some of our tasks can be heavy, running with a 1G heap, etc. The
document parser in particular puts a heavy load on CPU, RAM, and DFS while the
mappers crunch away.
Right now our average load is also around 50.
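For reference, per-TT slot limits and child heap like the ones described above are typically set along these lines in hadoop-site.xml (property names as of the Hadoop 0.19/0.20 era; this is an illustrative sketch, not our exact config file):

```xml
<!-- Cap each TaskTracker at 2 concurrent map tasks and 2 concurrent reduce tasks -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>
</property>
<!-- Heavy tasks run with a 1G heap in the child JVM -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1g</value>
</property>
```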
> Compactions at (re)start on a large table can overwhelm DFS
> -----------------------------------------------------------
>
> Key: HBASE-1062
> URL: https://issues.apache.org/jira/browse/HBASE-1062
> Project: Hadoop HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Priority: Critical
> Fix For: 0.20.0
>
>
> Given a large table, > 1000 regions for example, if a cluster restart is
> necessary, the compactions undertaken by the regionservers when the master
> makes initial region assignments can overwhelm DFS, leading to file errors
> and data loss. This condition is exacerbated if write load was heavy before
> restart and so many regions want to split as soon as they are opened.