[ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
-------------------------------
    Description: 
TableSnapshotInputFormat runs one map task per region in the table snapshot. 
This places an unnecessary restriction on the original table: its region 
layout has to take into account the processing resources available to the MR 
job. Allowing multiple mappers per region (assuming a reasonably even key 
distribution) would be useful.

With this feature, the client can specify the desired number of mappers when 
initializing the table snapshot mapper job:

{code}
TableMapReduceUtil.initTableSnapshotMapperJob(snapshotName,)
{code}


  was:
TableSnapshotInputFormat runs one map task per region in the table snapshot. 
This places an unnecessary restriction on the original table: its region 
layout has to take into account the processing resources available to the MR 
job. Allowing multiple mappers per region (assuming a reasonably even key 
distribution) would be useful.




> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --------------------------------------------------------------------------
>
>                 Key: HBASE-18090
>                 URL: https://issues.apache.org/jira/browse/HBASE-18090
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce
>            Reporter: Mikhail Antonov
>            Assignee: xinxin fan
>             Fix For: 2.0.0-beta-1
>
>         Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1-v2.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places an unnecessary restriction on the original table: its region 
> layout has to take into account the processing resources available to the MR 
> job. Allowing multiple mappers per region (assuming a reasonably even key 
> distribution) would be useful.
> With this feature, the client can specify the desired number of mappers when 
> initializing the table snapshot mapper job:
> {code}
> TableMapReduceUtil.initTableSnapshotMapperJob(snapshotName,)
> {code}
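
To illustrate what running several mappers per region could look like, here is a minimal sketch of dividing one region's key range into evenly sized sub-ranges, one per mapper. It is not taken from the attached patches. The class name RegionSplitSketch and the method splitBoundaries are hypothetical, and the sketch assumes fixed-width row keys that are compared as unsigned big-endian integers, which is only reasonable when keys are spread fairly evenly over the range.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of HBase.
final class RegionSplitSketch {

    /**
     * Returns n + 1 boundary keys that divide [start, end] into n
     * sub-ranges of nearly equal width. Assumes start and end have the
     * same length and that start sorts strictly before end.
     */
    static List<byte[]> splitBoundaries(byte[] start, byte[] end, int n) {
        if (n < 1 || start.length != end.length) {
            throw new IllegalArgumentException("need n >= 1 and equal-length keys");
        }
        // Interpret the keys as unsigned big-endian integers.
        BigInteger lo = new BigInteger(1, start);
        BigInteger hi = new BigInteger(1, end);
        if (lo.compareTo(hi) >= 0) {
            throw new IllegalArgumentException("start must sort before end");
        }
        BigInteger width = hi.subtract(lo);
        List<byte[]> boundaries = new ArrayList<>(n + 1);
        for (int i = 0; i <= n; i++) {
            // boundary_i = lo + width * i / n (integer division)
            BigInteger offset = width.multiply(BigInteger.valueOf(i))
                                     .divide(BigInteger.valueOf(n));
            boundaries.add(toFixedWidth(lo.add(offset), start.length));
        }
        return boundaries;
    }

    /** Converts a non-negative value to exactly len bytes, left-padded with zeros. */
    private static byte[] toFixedWidth(BigInteger value, int len) {
        byte[] raw = value.toByteArray(); // may carry a leading sign byte or be shorter than len
        byte[] out = new byte[len];
        int copy = Math.min(raw.length, len);
        System.arraycopy(raw, raw.length - copy, out, len - copy, copy);
        return out;
    }
}
```

With n = 4, splitBoundaries returns five boundary keys. Each adjacent pair could then serve as the start and stop row of one mapper's scan over the region.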



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
