Re: Doubt in HBase

Amandeep Khurana Thu, 20 Aug 2009 11:35:36 -0700

On Thu, Aug 20, 2009 at 9:42 AM, john smith <[email protected]> wrote:


> Hi all,
>
> I have one small doubt. Kindly answer it even if it sounds silly.
>

No questions are silly. Don't worry.


>
> I am using MapReduce with HBase in distributed mode. I have a table which
> spans 5 region servers. I am using TableInputFormat to read the data from
> the table in the map phase. When I run the program, how many map tasks are
> created by default? Is it one per region server or more?
>

If you set the number of map tasks to a high number (at least the number of
regions in the table), it automatically spawns one map task for each region
(not one per region server). Otherwise, it'll spawn the number you have
explicitly specified in the job.
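To make the rule above concrete, here is a minimal sketch in plain Java. It
assumes "a high number" means "at least the number of regions", so the result
is simply the smaller of the two values. The class and method names are
hypothetical and are not part of the HBase API, and the figure of 12 regions
is made up for illustration.

```java
public class MapTaskCount {

    // Hypothetical helper, not an HBase call: models the rule described above.
    // requestedMapTasks is the number of map tasks set on the job,
    // regionCount is the number of regions in the table being scanned.
    static int effectiveMapTasks(int requestedMapTasks, int regionCount) {
        if (requestedMapTasks < 1 || regionCount < 1) {
            throw new IllegalArgumentException("both values must be >= 1");
        }
        // Never more than one map task per region; otherwise honor the request.
        return Math.min(requestedMapTasks, regionCount);
    }

    public static void main(String[] args) {
        // Assumed example: a table with 12 regions.
        System.out.println(effectiveMapTasks(100, 12)); // one task per region
        System.out.println(effectiveMapTasks(4, 12));   // the explicit setting wins
    }
}
```

Note that the count depends on the number of regions, not on how many region
servers happen to host them.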


>
> Also, after the map tasks are over, the reduce task takes a bit more time.
> Is it due to moving the map output across the region servers, i.e., moving
> the values of the same key to a particular reducer before it can start? Is
> there any way I can optimize the code (e.g. by storing the data for the same
> reducer nearby)?
>

Increase the number of reducers. Each reducer will then have less data to move.
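As a rough illustration of why more reducers means less data per reducer, the
sketch below spreads keys over reducers with the same formula Hadoop's default
hash partitioner uses, (hashCode & Integer.MAX_VALUE) % numReducers. The class
name, the method name, and the use of 1000 integer keys are assumptions made
for this example only. Real keys will not spread this evenly.

```java
public class ReducerLoad {

    // Hypothetical helper: counts how many of the keys 0..numKeys-1 land on
    // each reducer under hash partitioning.
    static int[] keysPerReducer(int numKeys, int numReducers) {
        int[] counts = new int[numReducers];
        for (int i = 0; i < numKeys; i++) {
            Integer key = i; // an Integer's hashCode is its own value
            int partition = (key.hashCode() & Integer.MAX_VALUE) % numReducers;
            counts[partition]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // With 2 reducers each one receives half of the keys;
        // with 10 reducers each one receives a tenth.
        System.out.println(keysPerReducer(1000, 2)[0]);
        System.out.println(keysPerReducer(1000, 10)[0]);
    }
}
```

In an actual job the reducer count is set on the job configuration, for
example with JobConf.setNumReduceTasks(int) in the old mapred API.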


>
> Thanks :)
>
