[ https://issues.apache.org/jira/browse/HBASE-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Gray resolved HBASE-1172.
----------------------------------
Resolution: Duplicate
Included as part of the TableInputFormat (TIF) revamp in HBASE-1385.
> Modify TableInputFormat splitting algorithm to allow any number of mappers
> --------------------------------------------------------------------------
>
> Key: HBASE-1172
> URL: https://issues.apache.org/jira/browse/HBASE-1172
> Project: Hadoop HBase
> Issue Type: Improvement
> Components: mapred
> Reporter: Jonathan Gray
> Assignee: stack
> Priority: Minor
> Fix For: 0.20.0
>
>
> Currently, the number of mappers specified when using TableInputFormat is
> strictly followed if it is less than the total number of regions on the input
> table. If it is greater, the number of regions is used.
> This will modify the splitting algorithm to do the following:
> - Specify 0 mappers when you want # mappers = # regions.
> - If you specify fewer mappers than regions, it will use exactly the number
> you specify, based on the current algorithm.
> - If you specify more mappers than regions, it will divide each region's key
> range into sub-ranges such as [start,X) and [X,end). The number of mappers
> will always be a multiple of the number of regions, so that no scanner spans
> multiple regions.
> There is an additional issue in that the default number of mappers in JobConf
> is set to 1. That means if a user does not explicitly set the number of map
> tasks, a single mapper will be used. I'm going to deal with that in a
> separate JIRA because the issue already exists today, there are a number of
> ways to address it, and it is not required to complete this issue.