[ https://issues.apache.org/jira/browse/HBASE-10932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962998#comment-13962998 ]
Yu Li commented on HBASE-10932:
-------------------------------
In the implementation, we will check that the requested number of mappers is
smaller than the number of regions in the target table. If the requested
number is larger than the region count, RowCounter will fall back to the old
behavior, that is, one mapper per region.
Will attach the patch soon.
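
To illustrate the check described above, here is a minimal sketch and not the
actual patch. The class and method names are hypothetical, and both the
contiguous grouping of regions and the fallback for non-positive values are
assumptions made for the example. Regions are represented only by their index.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of HBase: groups region indices per mapper. */
class MapperRegionGrouping {

    /**
     * Splits region indices 0..regionCount-1 into contiguous groups, one group
     * per mapper. If requestedMaps is not smaller than regionCount (or is not
     * positive, an assumption of this sketch), it falls back to the old
     * behavior of one region per mapper.
     */
    static List<List<Integer>> groupRegions(int regionCount, int requestedMaps) {
        List<List<Integer>> groups = new ArrayList<>();
        if (regionCount <= 0) {
            return groups; // nothing to scan
        }
        int maps = (requestedMaps > 0 && requestedMaps < regionCount)
                ? requestedMaps
                : regionCount; // fallback: one mapper per region
        int base = regionCount / maps;  // regions every mapper gets
        int extra = regionCount % maps; // the first 'extra' mappers get one more
        int next = 0;
        for (int m = 0; m < maps; m++) {
            int size = base + (m < extra ? 1 : 0);
            List<Integer> group = new ArrayList<>();
            for (int i = 0; i < size; i++) {
                group.add(next++);
            }
            groups.add(group);
        }
        return groups;
    }

    public static void main(String[] args) {
        // 10 regions, 3 mappers requested: each mapper scans several regions.
        System.out.println(groupRegions(10, 3)); // [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
        // 4 regions, 8 mappers requested: falls back to one mapper per region.
        System.out.println(groupRegions(4, 8));  // [[0], [1], [2], [3]]
    }
}
```

Keeping each group contiguous means a mapper would scan adjacent key ranges,
but any assignment that covers every region exactly once would serve the same
purpose.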
> Improve RowCounter to allow mapper number set/control
> -----------------------------------------------------
>
> Key: HBASE-10932
> URL: https://issues.apache.org/jira/browse/HBASE-10932
> Project: HBase
> Issue Type: Improvement
> Components: mapreduce
> Reporter: Yu Li
> Assignee: Yu Li
> Priority: Minor
>
> The typical use case for RowCounter is a data integrity check, for example
> after exporting data from an RDBMS to HBase, or from one HBase cluster to
> another, to make sure the row (record) counts match. Such checks usually do
> not have strict response-time requirements.
> Meanwhile, in the current implementation, RowCounter launches one mapper per
> region, and each mapper sends one scan request. If the table is fairly large,
> say with tens of regions, and the MR cluster has enough CPU cores to run all
> the mappers at once, the parallel scan requests sent by the mappers can be
> a real burden on the HBase cluster.
> So in this JIRA, we propose adding an option "--maps" to rowcounter to
> specify the number of mappers, and making each mapper able to scan more than
> one region of the target table.