[ 
https://issues.apache.org/jira/browse/HBASE-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15366514#comment-15366514
 ] 

Thiruvel Thirumoolan commented on HBASE-16169:
----------------------------------------------

[~eclark], this aligns with the general principle that the master should stay 
out of the read/write path. Today, split calculation goes through the Master 
API.

We also believe this is the right direction for us internally since this 
approach scales well. ClusterStatus gets bulkier and bulkier as the cluster and 
tables grow. It also doesn't help that most of the information from 
ClusterStatus is thrown away, since it doesn't even belong to the table we are 
interested in.
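
To make the idea concrete, here is a rough sketch of the per-server lookup. The 
RegionLoad and RegionServer classes below are hypothetical stand-ins written for 
this illustration, not the actual HBase classes, and the method name 
getRegionLoad(table) is an assumption about how the proposed call might look. 
The point is only that the client asks the servers hosting the table and gets 
back loads for that table alone, rather than pulling a cluster-wide status and 
discarding most of it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RegionSizeSketch {

    /** Hypothetical stand-in for RegionLoad; only the fields this sketch needs. */
    static final class RegionLoad {
        final String table;
        final String regionName;
        final long storeFileSizeMB;

        RegionLoad(String table, String regionName, long storeFileSizeMB) {
            this.table = table;
            this.regionName = regionName;
            this.storeFileSizeMB = storeFileSizeMB;
        }
    }

    /** Hypothetical stand-in for a region server exposing the proposed call. */
    static final class RegionServer {
        private final List<RegionLoad> hosted = new ArrayList<>();

        RegionServer host(String table, String regionName, long sizeMB) {
            hosted.add(new RegionLoad(table, regionName, sizeMB));
            return this;
        }

        /** Loads of hosted regions; a null table means "all regions on this server". */
        List<RegionLoad> getRegionLoad(String table) {
            List<RegionLoad> result = new ArrayList<>();
            for (RegionLoad load : hosted) {
                if (table == null || table.equals(load.table)) {
                    result.add(load);
                }
            }
            return result;
        }
    }

    /** Region name -> size in MB for one table, asking only the region servers. */
    static Map<String, Long> regionSizes(List<RegionServer> servers, String table) {
        Map<String, Long> sizes = new LinkedHashMap<>();
        for (RegionServer server : servers) {
            for (RegionLoad load : server.getRegionLoad(table)) {
                sizes.put(load.regionName, load.storeFileSizeMB);
            }
        }
        return sizes;
    }

    public static void main(String[] args) {
        RegionServer rs1 = new RegionServer()
                .host("orders", "orders,a", 512)
                .host("logs", "logs,a", 9000);
        RegionServer rs2 = new RegionServer().host("orders", "orders,m", 256);

        // Only the "orders" regions come back; the "logs" region is never shipped.
        System.out.println(regionSizes(Arrays.asList(rs1, rs2), "orders"));
    }
}
```

In this sketch the amount of data returned grows with the size of the one table 
being scanned, not with the size of the whole cluster.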

> RegionSizeCalculator should not depend on master
> ------------------------------------------------
>
>                 Key: HBASE-16169
>                 URL: https://issues.apache.org/jira/browse/HBASE-16169
>             Project: HBase
>          Issue Type: Sub-task
>          Components: mapreduce, scaling
>            Reporter: Thiruvel Thirumoolan
>            Assignee: Thiruvel Thirumoolan
>             Fix For: 2.0.0, 1.4.0
>
>         Attachments: HBASE-16169.master.000.patch
>
>
> RegionSizeCalculator is needed for better split generation of MR jobs. This 
> requires RegionLoad which can be obtained via ClusterStatus, i.e. accessing 
> Master. We don't want master to be in this path.
> The proposal is to add an API to the RegionServer that gets RegionLoad of all 
> regions hosted on it or those of a table if specified. RegionSizeCalculator 
> can use the latter.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
