overloading specific nodes

SF Hadoop Thu, 09 Oct 2014 02:08:23 -0700

Haven't tried this. I'll give it a shot.

Thanks


On Thursday, October 9, 2014, Ted Yu <yuzhih...@gmail.com> wrote:

> Looks like the number of regions is lower than the number of nodes in the
> cluster.
>
> Can you split the table such that, after the HBase balancer is run, there is
> a region hosted by every node?
>
> Cheers
>
> On Oct 8, 2014, at 11:01 PM, SF Hadoop <sfhad...@gmail.com> wrote:
>
> > I'm not sure if this is an HBase issue or a Hadoop issue, so if this is
> > "off-topic" please forgive.
> >
> > I am having a problem with Hadoop maxing out drive space on a select few
> > nodes when I am running an HBase job.  The scenario is this:
> >
> > - The job is a data import using Map/Reduce / HBase
> > - The data is being imported to one table
> > - The table only has a couple of regions
> > - As the job runs, HBase? / Hadoop? begins placing the data in HDFS on
> >   the datanode / regionserver that is hosting the regions
> > - As the job progresses (and more data is imported) the two datanodes
> >   hosting the regions start to get full and eventually drive space hits 100%
> >   utilization whilst the other nodes in the cluster are at 40% or less drive
> >   space utilization
> > - The job in Hadoop then begins to hang with multiple "out of space"
> >   errors and eventually fails.
> >
> > I have tried running hadoop balancer during the job run and this helped
> > but only really succeeded in prolonging the eventual job failure.
> >
> > How can I get Hadoop / HBase to distribute the data to HDFS more evenly
> > when it is favoring the nodes that the regions are on?
> >
> > Am I missing something here?
> >
> > Thanks for any help.
>
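
The split suggestion quoted above could be sketched roughly as follows. This is only an illustration under stated assumptions, not something posted in the thread: it assumes row keys start with a uniformly distributed hex prefix (for example a hash or salt), and the table name `import_table` and both helper functions are hypothetical. It computes evenly spaced split points and prints HBase shell `split` commands followed by `balancer`, so that the resulting regions can be spread across the region servers.

```python
def hex_split_keys(num_regions, width=8):
    """Return num_regions - 1 evenly spaced hex split points, each `width` digits.

    Assumption: row keys begin with a uniformly distributed hex prefix.
    """
    if num_regions < 2:
        return []
    space = 16 ** width  # size of the hex key space being divided
    return [
        format(space * i // num_regions, "0{}x".format(width))
        for i in range(1, num_regions)
    ]


def shell_commands(table, num_regions, width=8):
    """Build HBase shell lines: one `split` per split point, then `balancer`."""
    cmds = [
        "split '{}', '{}'".format(table, key)
        for key in hex_split_keys(num_regions, width)
    ]
    cmds.append("balancer")
    return cmds


if __name__ == "__main__":
    # Hypothetical table name; aim for at least one region per node.
    print("\n".join(shell_commands("import_table", 4)))
```

The printed lines could be pasted into `hbase shell`. If the row keys are sequential rather than evenly distributed, evenly spaced split points like these would probably not spread the writes, so the split points would need to be chosen from the actual key distribution instead.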

Re: Hadoop / HBase hotspotting / overloading specific nodes
