Yu Li created HBASE-10932:
-----------------------------

             Summary: Improve RowCounter to allow mapper number set/control
                 Key: HBASE-10932
                 URL: https://issues.apache.org/jira/browse/HBASE-10932
             Project: HBase
          Issue Type: Improvement
          Components: mapreduce
            Reporter: Yu Li
            Assignee: Yu Li
            Priority: Minor


The typical use case for RowCounter is data integrity checking: for example, 
after exporting data from an RDBMS to HBase, or from one HBase cluster to 
another, to confirm that the row (record) counts match. Such checks usually 
do not have strict response-time requirements.
Meanwhile, in the current implementation, RowCounter launches one mapper per 
region, and each mapper sends one scan request. If the table is fairly large 
(say, tens of regions) and the MR cluster has enough CPU cores to run all the 
mappers at once, the parallel scan requests sent by the mappers can be a real 
burden on the HBase cluster.
So in this JIRA, we propose adding an option "--maps" to RowCounter to specify 
the number of mappers, and allowing each mapper to scan more than one region 
of the target table.
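
As a rough illustration of the idea (not the actual patch), the sketch below 
shows one way regions could be grouped so that a requested number of mappers 
each cover a contiguous run of regions. It is plain Python rather than HBase 
or MapReduce code; the function name group_regions, the (start_key, end_key) 
tuple representation, and the even-split policy are all assumptions made for 
this example.

```python
from typing import List, Tuple

# Hypothetical representation of a region: (start_key, end_key).
# An empty string stands for an open-ended boundary.
Region = Tuple[str, str]


def group_regions(regions: List[Region], num_maps: int) -> List[List[Region]]:
    """Split an ordered list of regions into at most num_maps contiguous groups.

    Each group would be handled by one mapper, which scans its regions in turn.
    Group sizes differ by at most one region.
    """
    if num_maps < 1:
        raise ValueError("num_maps must be at least 1")
    if not regions:
        return []

    # Never create more groups than there are regions.
    group_count = min(num_maps, len(regions))
    base, extra = divmod(len(regions), group_count)

    groups = []
    start = 0
    for i in range(group_count):
        # The first `extra` groups take one additional region.
        size = base + (1 if i < extra else 0)
        groups.append(regions[start:start + size])
        start += size
    return groups


if __name__ == "__main__":
    table_regions = [("", "b"), ("b", "d"), ("d", "f"), ("f", "h"), ("h", "")]
    for mapper_id, group in enumerate(group_regions(table_regions, 2)):
        print(mapper_id, group)
```

With five regions and a mapper count of two, the first mapper would cover 
three regions and the second two, so only two scans run against the cluster 
at any time instead of five.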



--
This message was sent by Atlassian JIRA
(v6.2#6252)
