[ 
https://issues.apache.org/jira/browse/HBASE-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Gray resolved HBASE-1172.
----------------------------------

    Resolution: Duplicate

Included as part of the TableInputFormat (TIF) revamp in HBASE-1385.

> Modify TableInputFormat splitting algorithm to allow any number of mappers
> --------------------------------------------------------------------------
>
>                 Key: HBASE-1172
>                 URL: https://issues.apache.org/jira/browse/HBASE-1172
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Jonathan Gray
>            Assignee: stack
>            Priority: Minor
>             Fix For: 0.20.0
>
>
> Currently, TableInputFormat strictly honors the number of mappers you 
> specify if it is less than the total number of regions in the input table.  
> If it is greater, the number of regions is used instead.
> This issue will modify the splitting algorithm to do the following:
> - Specify 0 mappers when you want the number of mappers to equal the number 
> of regions.
> - If you specify fewer mappers than regions, exactly the number you specify 
> will be used, following the current algorithm.
> - If you specify more mappers than regions, each region will be divided by 
> determining split points, e.g. [start,X) [X,end).  The number of mappers will 
> always be a multiple of the number of regions, so that no scanner spans 
> multiple regions.
> There is an additional issue: the default number of mappers in JobConf is 
> set to 1.  That means that if a user does not explicitly set the number of 
> map tasks, a single mapper will be used.  I'm going to deal with that in a 
> separate JIRA, because the problem already exists today, there are a number 
> of ways to address it, and fixing it is not required to complete this issue.
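
The three rules in the description can be sketched as follows. This is only an illustration under stated assumptions, not the HBase implementation: the function names `planned_split_count` and `split_range` are hypothetical, the choice to round down to a multiple of the region count is an assumption (the description does not say which way to round), and row keys are modeled as plain integers rather than byte arrays.

```python
def planned_split_count(requested_mappers, num_regions):
    """Return how many input splits the proposed rules would yield.

    Hypothetical helper: the name and the round-down choice are
    assumptions made for this sketch, not part of HBase.
    """
    if num_regions <= 0:
        raise ValueError("table must have at least one region")
    if requested_mappers < 0:
        raise ValueError("requested_mappers must be >= 0")
    if requested_mappers == 0:
        # 0 means "one mapper per region".
        return num_regions
    if requested_mappers <= num_regions:
        # Fewer mappers than regions (or the same): honor the request exactly.
        return requested_mappers
    # More mappers than regions: use a multiple of the region count so
    # that every split stays inside a single region (assumed: round down).
    return (requested_mappers // num_regions) * num_regions


def split_range(start, end, parts):
    """Divide [start, end) into `parts` contiguous half-open sub-ranges.

    Integer keys are a simplification; real row keys are byte arrays.
    """
    bounds = [start + (end - start) * i // parts for i in range(parts + 1)]
    return list(zip(bounds, bounds[1:]))


if __name__ == "__main__":
    regions = 5
    for requested in (0, 3, 12):
        print(requested, "->", planned_split_count(requested, regions))
    # Two sub-ranges for one region: [start, X) and [X, end).
    print(split_range(0, 100, 2))
```

With 5 regions and 12 requested mappers, this sketch yields 10 splits, i.e. two per region, each covering one of the [start,X) and [X,end) halves.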

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
