I just looked at the source code for HBaseStorage. It uses a modified
version of TableInputFormat under the hood. TableInputFormat, AFAIK, creates
one input split per region and offers no setting to control the number of
launched Map tasks. It might be a worthwhile contribution to HBase to write
an analogue of Hadoop's CombineFileInputFormat, so that a single Map task
can read multiple regions.
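
To make the grouping idea above concrete, here is a rough sketch of how
regions could be packed into a bounded number of combined splits. It is
plain Python and not HBase or Hadoop code: the (region name, host) tuples,
the function name combine_regions and the max_splits parameter are all
assumptions made up for illustration.

```python
from typing import List, Tuple

# Hypothetical shape: (region name, hosting region server).
Region = Tuple[str, str]


def combine_regions(regions: List[Region], max_splits: int) -> List[List[str]]:
    """Pack regions into at most max_splits groups of near-equal size."""
    if max_splits < 1:
        raise ValueError("max_splits must be at least 1")

    # Sort by host so that regions on the same server end up adjacent,
    # which keeps most groups on a single server.
    ordered = sorted(regions, key=lambda r: r[1])

    group_count = min(max_splits, len(ordered))
    if group_count == 0:
        return []

    # Spread the remainder over the first `extra` groups.
    base, extra = divmod(len(ordered), group_count)

    groups: List[List[str]] = []
    start = 0
    for i in range(group_count):
        size = base + (1 if i < extra else 0)
        groups.append([name for name, _host in ordered[start:start + size]])
        start += size
    return groups
```

With 1000 regions and max_splits=80, each group would hold 12 or 13
regions, so one Map task per group fits into 80 mapper slots. A real
input format would also have to expose each group as a single split and
scan its regions one after another.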


On Wed, May 21, 2014 at 10:21 AM, Hansi Klose <[email protected]> wrote:

> Hi Lei,
>
> I don't know if this helps you, but I had the same problem with the
> replication verify jobs I run in our environment.
>
> I created a fairscheduler pool called "admin" on the jobtracker and
> configured this pool with the maximum number of mappers the job should
> take.
>
> I inserted this section into my hbase-site.xml:
>
>   <property>
>     <name>mapred.queue.name</name>
>     <value>admin</value>
>   </property>
>
> You need to insert this only on the node where you start the job.
>
> Then I log in as user "hbase" on the machine with that configuration.
>
> When I run my verify jobs as user "hbase", the job goes to the
> fairscheduler pool "admin" and takes only the allowed number of mappers.
>
> Before that, I took all the mappers I could get.
>
> Regards Hansi
>
> > Sent: Wednesday, May 21, 2014 at 04:16
> > From: "[email protected]" <[email protected]>
> > To: user <[email protected]>, user <[email protected]>
> > Subject: How to set number of mappers when using HBaseStorage
> >
> >
> > When using HBaseStorage to read data from an hbase table, there will be
> > one mapper for one region.
> > However, my hbase table has more than 1000 regions, and I have capacity
> > for only 80 mappers.
> > Is there a way to set the number of mappers when using HBaseStorage?
> >
> > Thanks,
> > Lei
> >
> >
> >
> > [email protected]
> >
>
