Dave Shine,
    Can you share how much data each map task is processing? If the load
across the map tasks is uneven, it might be a hotspotting problem.
Have a look at
http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
  I have faced the same problem and am trying to implement HBaseWD.
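
  The general idea behind that kind of fix is to salt the key so that one
hot key is spread over several buckets. Below is a minimal sketch of that
idea only. It is not HBaseWD's API and not Hadoop code; the function names,
the "|" separator, NUM_SALTS and the synthetic records are assumptions made
up for illustration. Only the figure of 43 reducers comes from the quoted
mail.

```python
import zlib
from collections import Counter

NUM_REDUCERS = 43  # reducer count mentioned in the quoted mail
NUM_SALTS = 8      # hypothetical number of buckets per key


def partition(key, num_reducers=NUM_REDUCERS):
    """Stand-in for a hash partitioner: map a key to a reducer index."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def salted_key(key, record_id, num_salts=NUM_SALTS):
    """Prefix the key with a small bucket number derived from the record."""
    return f"{record_id % num_salts}|{key}"


def original_key(salted):
    """Strip the salt prefix to recover the key that was emitted."""
    return salted.split("|", 1)[1]


def load_per_reducer(keys):
    """Count how many records each reducer would receive."""
    return Counter(partition(k) for k in keys)


# Synthetic input: one hot key plus many ordinary keys (made-up data).
records = ["hot"] * 10000 + [f"key{i}" for i in range(10000)]

plain = load_per_reducer(records)
salted = load_per_reducer(salted_key(k, i) for i, k in enumerate(records))

print("largest reducer without salt:", max(plain.values()))
print("largest reducer with salt:   ", max(salted.values()))
```

  With salting, each reducer sees only part of a hot key, so a second,
much smaller pass is needed to strip the salt and merge the partial results
per original key.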

            Thanks and Regards,
        S SYED ABDUL KATHER



On Fri, Jul 20, 2012 at 6:50 PM, Dave Shine <
dave.sh...@channelintelligence.com> wrote:

>  I have a job that is emitting over 3 billion rows from the map to the
> reduce.  The job is configured with 43 reduce tasks.  A perfectly even
> distribution would amount to about 70 million rows per reduce task.
> However I actually got around 60 million for most of the tasks, one task
> got over 100 million, and one task got almost 350 million.  This uneven
> distribution caused the job to run exceedingly long.
>
> I believe this is referred to as a “key skew problem”, which I know is
> heavily dependent on the actual data being processed.  Can anyone point me
> to any blog posts, white papers, etc. that might give me some options on
> how to deal with this issue?
>
> Thanks,
>
> Dave Shine
>
> Sr. Software Engineer
>
> 321.939.5093 direct |  407.314.0122 mobile
