[ https://issues.apache.org/jira/browse/HBASE-1385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12725614#action_12725614 ]
Lars George commented on HBASE-1385:
------------------------------------
Now, the last issue is HBASE-1172, which was merged in here. The new framework
basically does away with setNumMapTasks() altogether as it was always just a
hint to the framework as to how many mappers should run. The actual number is
driven by the number of splits, in our case what TableInputFormat.getSplits()
returns. So we now always set it to the number of regions, as per previous
discussions. Do you still want to be able to override it?
We could, by reading
context.getConfiguration().getInt("mapred.map.tasks", defaultValue), but we
would also have to set that value manually before starting the job, because the
setter is now gone, as described above. Alternatively, we could set it under a
key constant defined in TIF, as is done for the table name etc.
But what is the advantage?
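For illustration only, here is a rough sketch of what such an override could
look like. It is not code from the patch. A plain Map stands in for the Hadoop
Configuration so the snippet runs on its own, and the class name, the helper
names and the rule "ignore values below 1 or above the region count" are all
assumptions. A real getSplits() would also have to group regions into the
smaller number of splits, which is not shown here.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a Map<String, String> stands in for Hadoop's Configuration.
class SplitCountSketch {
    // Key mentioned in the comment above; TIF could define its own constant instead.
    static final String MAP_TASKS_KEY = "mapred.map.tasks";

    // Minimal stand-in for Configuration.getInt(name, defaultValue).
    static int getInt(Map<String, String> conf, String key, int defaultValue) {
        String value = conf.get(key);
        return value == null ? defaultValue : Integer.parseInt(value.trim());
    }

    // Default: one split per region. A configured value is honoured only if it
    // lies between 1 and the region count (assumed policy, not from the patch).
    static int splitCount(int regionCount, Map<String, String> conf) {
        int requested = getInt(conf, MAP_TASKS_KEY, regionCount);
        if (requested < 1 || requested > regionCount) {
            return regionCount;
        }
        return requested;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(splitCount(8, conf)); // nothing configured: 8
        conf.put(MAP_TASKS_KEY, "4");            // must be set by hand before job start
        System.out.println(splitCount(8, conf)); // override applies: 4
    }
}
```

Note that the value has to be put into the configuration by hand, which is the
extra step mentioned above now that the setter is gone.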
Apart from this, the patch should be complete after your review.
> Revamp TableInputFormat, needs updating to match hadoop 0.20.x AND remove bit
> where we can make < maps than regions
> -------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-1385
> URL: https://issues.apache.org/jira/browse/HBASE-1385
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: stack
> Fix For: 0.21.0
>
> Attachments: 1385-v5.patch, 1385-v6.patch, 1385-v7.patch,
> 1385-v8.patch, mr.patch
>
>
> Update TIF to match new MR.
> Remove the bit of logic where we will use number of configured maps as splits
> count rather than regions.