[ 
https://issues.apache.org/jira/browse/HBASE-9556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14006999#comment-14006999
 ] 

Anoop Sam John commented on HBASE-9556:
---------------------------------------

Related issue HBASE-4063

> Provide key range support to bulkload to avoid too many reducers even when the 
> data belongs to few regions
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-9556
>                 URL: https://issues.apache.org/jira/browse/HBASE-9556
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce
>            Reporter: rajeshbabu
>            Assignee: rajeshbabu
>            Priority: Minor
>
> Presently, the number of reducers in a bulk load is equal to the number of regions.
> Suppose a table has 500 regions and the import data belongs to only 10 of them; 
> we still start 500 reducers (equal to the number of regions) instead of 10, 
> which consumes more time and resources. 
> If the user knows the row key range of the import data, we can pass a startkey 
> and/or endkey as input and, based on that key range, define the 
> partitions and the number of reducers (the regions to which the data belongs). This 
> avoids starting many reducers that do nothing and also reduces 
> contention in shuffling.
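
The key-range idea in the description can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patch for this issue: the class KeyRangePartitions and its methods are hypothetical names, it uses plain byte arrays instead of HBase APIs, it treats endkey as inclusive, and it takes an empty startkey or endkey to mean "unbounded". Given the sorted region start keys of a table, it keeps only the regions that overlap the import range; the size of the result would be the number of reducers, and the selected start keys would serve as the partition boundaries.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase: picks the regions that overlap
// a user-supplied row key range so that only those regions get a reducer.
public class KeyRangePartitions {

    // Unsigned lexicographic comparison of two row keys.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // regionStartKeys must be sorted; the first entry is the empty key.
    // Assumption: an empty startKey or endKey means "no bound on that side",
    // and endKey is inclusive.
    static List<byte[]> selectRegionStartKeys(List<byte[]> regionStartKeys,
                                              byte[] startKey, byte[] endKey) {
        List<byte[]> selected = new ArrayList<>();
        for (int i = 0; i < regionStartKeys.size(); i++) {
            byte[] regionStart = regionStartKeys.get(i);
            // A region ends where the next one starts; the last region is unbounded.
            byte[] regionEnd = i + 1 < regionStartKeys.size()
                    ? regionStartKeys.get(i + 1) : null;
            boolean endsAfterRangeStart = regionEnd == null
                    || startKey.length == 0
                    || compare(regionEnd, startKey) > 0;
            boolean startsBeforeRangeEnd = endKey.length == 0
                    || compare(regionStart, endKey) <= 0;
            if (endsAfterRangeStart && startsBeforeRangeEnd) {
                selected.add(regionStart);
            }
        }
        return selected;
    }

    static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Five regions: ["", b), [b, d), [d, f), [f, h), [h, end of table).
        List<byte[]> regionStartKeys = new ArrayList<>();
        for (String s : new String[] {"", "b", "d", "f", "h"}) {
            regionStartKeys.add(key(s));
        }
        List<byte[]> selected =
                selectRegionStartKeys(regionStartKeys, key("c"), key("e"));
        System.out.println("Reducers needed: " + selected.size()
                + " of " + regionStartKeys.size());
    }
}
```

With the sample keys in main, the range c..e overlaps only the regions starting at b and d, so two reducers would be started instead of five.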



--
This message was sent by Atlassian JIRA
(v6.2#6252)
