[ 
https://issues.apache.org/jira/browse/HBASE-4755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589576#comment-13589576
 ] 

Jonathan Hsieh commented on HBASE-4755:
---------------------------------------

bq. I think I should write a spec up and we can continue the discussion on that 
spec as the base.

+1.  Especially since this feature touches multiple components (hdfs, region 
creation, table creation, region balancer).

bq. Yes (in particular, I am currently only considering pre-split tables, and 
the createTable call in the master allocates the regions to particular 
datanodes). The regionservers would use this information (as the favored nodes) 
for creating any new file on the hdfs.

I'm assuming the first cut would use "random" assignment for 2nd and 3rd 
replicas?  

Handling "natural" splits would be a follow-on jira? (A simple policy for 
splits is for the daughters to choose the same datanodes as the parent; 
something slightly smarter would have the parent as one of the replicas but 
pick 2 new nodes with higher priority.)
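
To make the two split policies concrete, here is a minimal Python sketch. It 
is not HBase code: the function names and the list-of-hostnames representation 
are hypothetical, and the second function is only one reading of the 
"slightly smarter" policy (keep the parent's first node, add two nodes the 
parent did not use). The order of the returned list is an arbitrary choice.

```python
import random


def inherit_favored_nodes(parent_favored):
    """Simple policy: both daughters reuse the parent's favored datanodes."""
    return list(parent_favored), list(parent_favored)


def keep_parent_pick_new(parent_favored, all_nodes, rng=random):
    """Keep the parent's first node and pick two nodes the parent did not use.

    Assumption: parent_favored[0] is the node hosting the parent region.
    """
    parent_node = parent_favored[0]
    candidates = [n for n in all_nodes if n not in parent_favored]
    if len(candidates) < 2:
        raise ValueError("need at least two datanodes outside the parent's set")
    return [parent_node] + rng.sample(candidates, 2)


if __name__ == "__main__":
    parent = ["dn1", "dn2", "dn3"]
    cluster = ["dn%d" % i for i in range(1, 7)]
    print(inherit_favored_nodes(parent))
    print(keep_parent_pick_new(parent, cluster))
```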

bq. They could fight, but the hdfs balancer could in theory be tweaked to not 
move blocks for certain paths. The hdfs balancer needs to be run manually.

Makes sense, but sounds like an argument for adding some persistent attribute 
state in hdfs (or having hdfs consult hbase.. yuck).

bq. The tool would look at the regions, their locality information, and try to 
make sure the map from regions to favored nodes is optimal. It might reassign 
regions in this process (i.e., update the meta table with the new information, 
that would then be propagated to the regionservers). The regionservers, like 
before, would use this information (as the favored nodes) for creating any new 
file on the hdfs.

So this tool sounds like a new balancer that might fight with the built-in 
hbase balancer.  This makes me want to make the hbase balancer external to 
the master. :)

                
> HBase based block placement in DFS
> ----------------------------------
>
>                 Key: HBASE-4755
>                 URL: https://issues.apache.org/jira/browse/HBASE-4755
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 0.94.0
>            Reporter: Karthik Ranganathan
>            Assignee: Christopher Gist
>            Priority: Critical
>         Attachments: 4755-wip-1.patch
>
>
> As is, the feature is only useful for HBase clusters that care about data 
> locality on regionservers, but it can also enable a lot of nice features 
> down the road.
> The basic idea is as follows: instead of letting HDFS determine where to 
> replicate data (r=3) by placing blocks on various nodes, it is better to let 
> HBase do so by providing hints to HDFS through the DFS client. That way, 
> instead of replicating data at a block level, we can replicate data at a 
> per-region level (each region owned by a primary, a secondary and a tertiary 
> regionserver). This is better for 2 things:
> - Can make region failover faster on clusters which benefit from data affinity
> - On large clusters with random block placement policy, this helps reduce the 
> probability of data loss
> The algo is as follows:
> - Each region in META will have 3 columns which are the preferred 
> regionservers for that region (primary, secondary and tertiary)
> - Preferred assignment can be controlled by a config knob
> - Upon cluster start, HMaster will enter a mapping from each region to 3 
> regionservers (random hash, could use current locality, etc)
> - The load balancer would assign out regions preferring region assignments to 
> primary over secondary over tertiary over any other node
> - Periodically (say weekly, configurable) the HMaster would run a locality 
> check and make sure the map it has for region to regionservers is optimal.
> Down the road, this can be enhanced to control region placement in the 
> following cases:
> - Mixed hardware SKU where some regionservers can hold fewer regions
> - Load balancing across tables where we don't want multiple regions of a 
> table to get assigned to the same regionservers
> - Multi-tenancy, where we can restrict the assignment of the regions of some 
> table to a subset of regionservers, so an abusive app cannot take down the 
> whole HBase cluster.
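
The algorithm in the issue description above (a per-region map to three 
preferred regionservers, plus a balancer that prefers primary over secondary 
over tertiary over any other node) can be sketched roughly as follows. This is 
an illustration in Python, not HBase code: the function names are 
hypothetical, the hash-ranking scheme is just one stand-in for the "random 
hash" the description mentions, and falling back to the least-loaded live 
server is an assumption.

```python
import hashlib


def favored_nodes(region_name, servers, replicas=3):
    """Pick `replicas` distinct servers for a region by ranking servers on a
    hash of (region, server). Deterministic for a given server set."""
    def rank(server):
        return hashlib.md5(f"{region_name}|{server}".encode()).hexdigest()
    return sorted(servers, key=rank)[:replicas]


def choose_server(favored, live_servers, load):
    """Prefer primary, then secondary, then tertiary; otherwise fall back to
    the least-loaded live server (assumed fallback). Returns None if no
    server is live."""
    for server in favored:
        if server in live_servers:
            return server
    if not live_servers:
        return None
    return min(sorted(live_servers), key=lambda s: load.get(s, 0))


if __name__ == "__main__":
    servers = ["rs1", "rs2", "rs3", "rs4", "rs5"]
    fav = favored_nodes("table1,row0500", servers)
    live = set(servers) - {fav[0]}  # pretend the primary is down
    print(fav, "->", choose_server(fav, live, {}))
```

With the primary down, the sketch hands the region to the secondary, which 
under this proposal would already hold a replica of the region's files.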
