Great advice, guys; I appreciate it. I have made the changes to increase the storefile size. I'd also like to prevent rebalancing while my large M/R Put job is running. Is there any way to do that?
At present, 50% of the time that I run my large M/R Put job, the table is corrupted (hole in .META.) and we have to run our repair program to fix the hole. It's very labor intensive. I am hoping that by turning off splitting and deferring balancing, I can prevent whatever condition leads to the creation of the hole in .META. My hope is that if we prevent splitting and rebalancing, then there would be no action that could cause a hole to occur.

-geoff

-----Original Message-----
From: Doug Meil [mailto:[email protected]]
Sent: Sunday, September 04, 2011 9:12 AM
To: [email protected]
Cc: [email protected]
Subject: Re: prevent region splits?

Along with what Jack said, see this...

http://hbase.apache.org/book.html#required_configuration

.. and just double check that you don't have scheduled major compactions going off once a day (the default)

On 9/3/11 7:54 PM, "Jack Levin" <[email protected]> wrote:

>Make hbase.hregion.max.filesize to be very large. Then your regions
>won't split. We use this method when copying 'live' hbase to make a
>backup.
>
>-Jack
>
>On Sat, Sep 3, 2011 at 4:32 PM, Geoff Hendrey <[email protected]>
>wrote:
>> Is there a way to prevent regions from splitting while we are running a
>> mapreduce job that does a lot of Puts? It seems that there is a lot of
>> HDFS activity related to the splitting of regions while my M/R job is
>> doing the puts. Is it sensible to disable splitting during the job that
>> does lots of Put? Would there be any danger in this (i.e. disabling
>> splitting during the job, and re-enabling it when the job completes)?
>>
>> I see the hbase.regionserver.thread.splitcompactcheckfrequency could be
>> used to make splits happen less frequently, but what I'd really like is
>> for splitting to be disabled, then re-enabled later.
>>
>> -Geoff
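
For reference, the two settings discussed in this thread could be written in hbase-site.xml roughly as shown here. This is only a sketch under stated assumptions: hbase.hregion.max.filesize is the property Jack names, and hbase.hregion.majorcompaction is the interval behind the once-a-day major compactions Doug mentions. The 100 GB value is an illustrative guess at "very large", not a recommendation from anyone in the thread, and the region servers would most likely need a restart to pick up the change.

```xml
<!-- Fragment for hbase-site.xml; place inside the existing <configuration> element. -->

<!-- Region split threshold in bytes. 107374182400 = 100 GB, an assumed
     "very large" value chosen only for illustration. -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>107374182400</value>
</property>

<!-- Interval between time-based major compactions, in milliseconds.
     0 turns the periodic schedule off; the default is one day. -->
<property>
  <name>hbase.hregion.majorcompaction</name>
  <value>0</value>
</property>
```

Neither property affects the balancer, so the rebalancing question is separate from this fragment. Restoring the original values after the job completes would re-enable splitting and scheduled major compactions.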
