Haven't tried this. I'll give it a shot. Thanks
On Thursday, October 9, 2014, Ted Yu <yuzhih...@gmail.com> wrote:

> Looks like the number of regions is lower than the number of nodes in the
> cluster.
>
> Can you split the table such that, after hbase balancer is run, there is a
> region hosted by every node?
>
> Cheers
>
> On Oct 8, 2014, at 11:01 PM, SF Hadoop <sfhad...@gmail.com> wrote:
>
> > I'm not sure if this is an HBase issue or a Hadoop issue so if this is
> > "off-topic" please forgive.
> >
> > I am having a problem with Hadoop maxing out drive space on a select few
> > nodes when I am running an HBase job. The scenario is this:
> >
> > - The job is a data import using Map/Reduce / HBase
> > - The data is being imported to one table
> > - The table only has a couple of regions
> > - As the job runs, HBase? / Hadoop? begins placing the data in HDFS on
> >   the datanode / regionserver that is hosting the regions
> > - As the job progresses (and more data is imported) the two datanodes
> >   hosting the regions start to get full and eventually drive space hits
> >   100% utilization whilst the other nodes in the cluster are at 40% or
> >   less drive space utilization
> > - The job in Hadoop then begins to hang with multiple "out of space"
> >   errors and eventually fails.
> >
> > I have tried running hadoop balancer during the job run and this helped
> > but only really succeeded in prolonging the eventual job failure.
> >
> > How can I get Hadoop / HBase to distribute the data to HDFS more evenly
> > when it is favoring the nodes that the regions are on?
> >
> > Am I missing something here?
> >
> > Thanks for any help.
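
For reference, the splitting suggestion above can be sketched in a few lines. This is only an illustration, not something taken from the thread: it assumes the row keys are fixed-width lowercase hex strings spread roughly uniformly over the key space, which may not match the real import data. The names even_split_keys and region_index are hypothetical helpers, not part of HBase. The idea is to pick one split point per extra region so that each node in the cluster can end up hosting a region once the balancer has run.

```python
from bisect import bisect_right


def even_split_keys(num_regions, key_width=8):
    """Return num_regions - 1 split keys evenly spaced over a hex key space.

    Assumption (not from the thread): row keys are lowercase hex strings of
    exactly key_width characters, distributed roughly uniformly.
    """
    if num_regions < 1:
        raise ValueError("num_regions must be at least 1")
    space = 16 ** key_width  # number of distinct keys of this width
    return [
        format(i * space // num_regions, "0{}x".format(key_width))
        for i in range(1, num_regions)
    ]


def region_index(row_key, split_keys):
    """Return the index of the region that would hold row_key.

    Each region covers [start_key, end_key), so a key equal to a split
    point belongs to the region that starts at that split point.
    """
    return bisect_right(split_keys, row_key)


if __name__ == "__main__":
    # Hypothetical cluster of 4 nodes: aim for at least 4 regions.
    splits = even_split_keys(4, key_width=2)
    print(splits)
    for key in ("00", "3f", "40", "9a", "ff"):
        print(key, "-> region", region_index(key, splits))
```

If the real keys are not uniform hex strings (sequential IDs or timestamps, for example), evenly spaced split points will not spread the writes, and the split keys would have to be chosen from the actual key distribution instead. Split points like these can be supplied when the table is created, or the existing table can be split at them before the balancer is run.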