Great advice, guys. I appreciate it. I have made the changes to increase
the storefile size. I'd also like to prevent rebalancing while I am
running my large M/R Put job. Is there any way to do that?

At present, 50% of the time that I run my large M/R Put job, the table
is corrupted (a hole in .META.) and we have to run our repair program to
fix the hole. It's very labor intensive. I am hoping that by turning off
splitting and deferring balancing, I can prevent whatever condition
leads to the creation of the hole in the .META. table. My hope is that
if we prevent splitting and rebalancing, then there would be no action
that could cause a hole to occur.

-geoff

-----Original Message-----
From: Doug Meil [mailto:[email protected]] 
Sent: Sunday, September 04, 2011 9:12 AM
To: [email protected]
Cc: [email protected]
Subject: Re: prevent region splits?


Along with what Jack said, see this...

http://hbase.apache.org/book.html#required_configuration

... and just double-check that you don't have scheduled major compactions
going off once a day (the default).
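
As a rough sketch of those two suggestions, an hbase-site.xml fragment
might look like the one below. The 100 GB figure is only an illustrative
assumption (nobody in this thread names a value), and setting
hbase.hregion.majorcompaction to 0 is the usual way to turn off
time-based major compactions. Check both against the documentation for
your HBase version before relying on them.

```xml
<!-- Sketch only: goes inside the <configuration> element of hbase-site.xml -->

<!-- Region split threshold in bytes. 107374182400 (100 GB) is an assumed,
     deliberately huge value so that regions are unlikely to reach it. -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>107374182400</value>
</property>

<!-- Interval between scheduled major compactions in milliseconds.
     The default is 86400000 (once a day); 0 disables the schedule. -->
<property>
  <name>hbase.hregion.majorcompaction</name>
  <value>0</value>
</property>
```

With scheduled major compactions disabled, they would need to be
triggered by hand at a quiet time, and the region servers typically
need a restart to pick up changes to this file.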



On 9/3/11 7:54 PM, "Jack Levin" <[email protected]> wrote:

>Set hbase.hregion.max.filesize to a very large value. Then your regions
>won't split.  We use this method when copying 'live' hbase to make a
>backup.
>
>-Jack
>
>On Sat, Sep 3, 2011 at 4:32 PM, Geoff Hendrey <[email protected]>
>wrote:
>> Is there a way to prevent regions from splitting while we are running
>> a mapreduce job that does a lot of Puts? It seems that there is a lot
>> of HDFS activity related to the splitting of regions while my M/R job
>> is doing the puts. Is it sensible to disable splitting during the job
>> that does lots of Put? Would there be any danger in this (i.e.
>> disabling splitting during the job, and re-enabling it when the job
>> completes)?
>>
>> I see the hbase.regionserver.thread.splitcompactcheckfrequency could
>> be used to make splits happen less frequently, but what I'd really
>> like is for splitting to be disabled, then re-enabled later.
>>
>> -Geoff
