[ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181421#comment-16181421
 ] 

Ashu Pachauri commented on HBASE-18090:
---------------------------------------

[~xinxin fan] Thanks for the patch. I have left some minor comments on the 
review board. Other than that, the patch looks good.

Also, I think the same change (being able to pass numSplitsPerRegion) applies 
not just to snapshots but also to tables, and thus to TableInputFormat. How 
much work would it take to add support for this config in TableInputFormat? 
Maybe we can just delegate this to MultiTableInputFormat. That said, I think 
we can get this change in for snapshots and file a follow-up issue to add 
support for tables.
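
For illustration only, here is a rough sketch of what splitting one region's 
key range into numSplitsPerRegion sub-ranges could look like. It is not taken 
from the patch: the class and method names (RegionSplitSketch, splitKeyRange, 
toFixedLength) are hypothetical, it uses plain Java rather than HBase 
utilities, it assumes reasonably evenly distributed keys, and it does not 
handle the empty start/end keys of the first and last regions.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RegionSplitSketch {

    /**
     * Hypothetical helper: returns numSplits + 1 boundary keys, from start to
     * end inclusive, so that each adjacent pair could back one mapper.
     * Keys are treated as unsigned big-endian numbers, right-padded with
     * zero bytes to a common length.
     */
    static List<byte[]> splitKeyRange(byte[] start, byte[] end, int numSplits) {
        if (numSplits < 1) {
            throw new IllegalArgumentException("numSplits must be >= 1");
        }
        int len = Math.max(start.length, end.length);
        BigInteger lo = new BigInteger(1, Arrays.copyOf(start, len));
        BigInteger hi = new BigInteger(1, Arrays.copyOf(end, len));
        if (lo.compareTo(hi) >= 0) {
            throw new IllegalArgumentException("start key must sort before end key");
        }
        BigInteger range = hi.subtract(lo);

        List<byte[]> bounds = new ArrayList<>();
        bounds.add(start);
        // Interior boundaries at evenly spaced offsets within the range.
        // Very narrow ranges may yield duplicate boundaries.
        for (int i = 1; i < numSplits; i++) {
            BigInteger offset = range.multiply(BigInteger.valueOf(i))
                                     .divide(BigInteger.valueOf(numSplits));
            bounds.add(toFixedLength(lo.add(offset), len));
        }
        bounds.add(end);
        return bounds;
    }

    /** Converts a non-negative value to exactly len bytes, big-endian. */
    static byte[] toFixedLength(BigInteger value, int len) {
        byte[] raw = value.toByteArray(); // may carry an extra leading zero byte
        byte[] out = new byte[len];
        int copy = Math.min(raw.length, len);
        System.arraycopy(raw, raw.length - copy, out, len - copy, copy);
        return out;
    }
}
```

With this sketch, a region covering keys 0x00 to 0x40 and numSplitsPerRegion 
of 4 would yield four sub-ranges of equal width, each of which could feed its 
own mapper.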


> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --------------------------------------------------------------------------
>
>                 Key: HBASE-18090
>                 URL: https://issues.apache.org/jira/browse/HBASE-18090
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce
>    Affects Versions: 1.4.0
>            Reporter: Mikhail Antonov
>            Assignee: xinxin fan
>         Attachments: HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places an unnecessary restriction: the region layout of the original 
> table needs to take the processing resources available to the MR job into 
> consideration. Allowing multiple mappers per region (assuming a 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
