Re: How to control the size of WAL logs

Jean-Daniel Cryans Mon, 21 Mar 2011 21:24:10 -0700

Nothing in HBase prevents you from having that many regions, but for
various reasons we usually try to keep it under a few hundred per
region server. This matters especially for bulk uploads, where the
number of memstores a region server (RS) has to manage has a huge impact.
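
To make the memstore point concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the region server caps the sum of all memstores at a fraction of its heap; the 0.4 used here is an assumed value for hbase.regionserver.global.memstore.upperLimit (check your own configuration), and the heap size and region counts are made-up illustration values, not numbers from your cluster.

```python
def avg_memstore_budget_mb(heap_gb, regions, global_memstore_fraction=0.4):
    """Average heap (in MB) available to each region's memstore before
    global memstore pressure starts forcing flushes.

    global_memstore_fraction is an assumed setting, not a measured one.
    """
    total_memstore_mb = heap_gb * 1024 * global_memstore_fraction
    return total_memstore_mb / regions


# Compare a few hundred regions with a few thousand on a hypothetical 8GB heap.
for regions in (200, 2000):
    budget = avg_memstore_budget_mb(heap_gb=8, regions=regions)
    print(f"{regions} regions -> about {budget:.1f} MB of memstore per region")
```

If all regions are taking writes, thousands of them leave each memstore only a few MB before it is flushed, which is in the same range as the 6.2m flushes "due to global heap pressure" in your log.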


You can raise the size at which regions split by setting MAX_FILESIZE
on your table to at least 1GB (if you can give your region server a big
heap, like 8-10GB, then you can make those regions even bigger).
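
As a rough illustration of why a larger MAX_FILESIZE helps, the sketch below estimates the minimum number of regions needed to hold a given amount of data when no region grows past the split size. The 500GB per region server is a hypothetical figure, and 0.25GB stands in for an assumed smaller split size you might be running with today.

```python
import math


def min_regions(data_gb, max_filesize_gb):
    """Lower bound on the region count if no region exceeds the split size."""
    return math.ceil(data_gb / max_filesize_gb)


data_gb = 500  # hypothetical amount of data served by one region server
for max_filesize_gb in (0.25, 1, 4):
    count = min_regions(data_gb, max_filesize_gb)
    print(f"MAX_FILESIZE={max_filesize_gb}GB -> at least {count} regions")
```

The real count will be higher, since freshly split regions sit well below the threshold, but the trend is what matters: a bigger split size means fewer regions per server.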

J-D

On Mon, Mar 21, 2011 at 7:59 PM, 茅旭峰 <[email protected]> wrote:
> Thanks, J-D.
>
> No, we are not using any compression.
>
> We have a limited number of nodes for region servers, so each of them
> holds thousands of regions. Are there any guidelines on this point?
>
> On Tue, Mar 22, 2011 at 10:30 AM, Jean-Daniel Cryans 
> <[email protected]> wrote:
>
>> HBase doesn't put a hard block on the number of hlogs like it does for
>> memstore size or store files to compact, so it seems you are able to
>> insert more data than you are flushing.
>>
>> Are you using GZ compression? This could be a cause for slow flushes.
>>
>> How many regions do you have per region server? Your log seems to
>> indicate that you have a ton of them.
>>
>> J-D
>>
>> On Mon, Mar 21, 2011 at 7:23 PM, 茅旭峰 <[email protected]> wrote:
>> > Regarding hbase.regionserver.maxlogs,
>> >
>> > I've set it to 2, but it turns out the number of files under /hbase/.logs
>> > still keeps increasing.
>> > I see lots of logs like
>> > ====
>> > 2011-03-22 00:00:07,156 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for table1,sZD5CTBLUdV55xWWkmkI5rb1mJM=,1300587568567.8a84acf58dd3d684ccaa47d4fb4fd53a. because regionserver60020.cacheFlusher; priority=-8, compaction queue size=1755
>> > 2011-03-22 00:00:07,183 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up with memory above low water.
>> > 2011-03-22 00:00:07,186 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Under global heap pressure: Region table1,KGBJhl9RON29fT0hhak5-tBc-zs=,1300521656641.56e3a141164b546ae84d57e46a513922. has too many store files, but is 6.2m vs best flushable region's 2.1m. Choosing the bigger.
>> > 2011-03-22 00:00:07,186 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region table1,KGBJhl9RON29fT0hhak5-tBc-zs=,1300521656641.56e3a141164b546ae84d57e46a513922. due to global heap pressure
>> > 2011-03-22 00:00:07,186 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for table1,KGBJhl9RON29fT0hhak5-tBc-zs=,1300521656641.56e3a141164b546ae84d57e46a513922., current region memstore size 6.2m
>> > 2011-03-22 00:00:07,201 INFO org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Using syncFs -- HDFS-200
>> > 2011-03-22 00:00:07,241 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/cloud138,60020,1300712706331/cloud138%3A60020.1300723196796, entries=119, filesize=67903254. New hlog /hbase/.logs/cloud138,60020,1300712706331/cloud138%3A60020.1300723207156
>> > 2011-03-22 00:00:07,241 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=398, maxlogs=2; forcing flush of 1 regions(s): 334c81997502eb3c66c2bb9b47a87bcc
>> > 2011-03-22 00:00:07,242 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished snapshotting, commencing flushing stores
>> > 2011-03-22 00:00:07,577 INFO org.apache.hadoop.hbase.regionserver.Store: Renaming flushed file at hdfs://cloud137:9000/hbase/table1/56e3a141164b546ae84d57e46a513922/.tmp/907665384208923152 to hdfs://cloud137:9000/hbase/table1/56e3a141164b546ae84d57e46a513922/cfEStore/2298819588481793315
>> > 2011-03-22 00:00:07,589 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://cloud137:9000/hbase/table1/56e3a141164b546ae84d57e46a513922/cfEStore/2298819588481793315, entries=6, sequenceid=2229486, memsize=6.2m, filesize=6.2m
>> > 2011-03-22 00:00:07,591 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.2m for region table1,KGBJhl9RON29fT0hhak5-tBc-zs=,1300521656641.56e3a141164b546ae84d57e46a513922. in 405ms, sequenceid=2229486, compaction requested=true
>> > ====
>> >
>> > Does this mean we have too many requests for the region server to
>> > keep up with the growth of the hlogs?
>> >
>> > On the other hand, when there are too many files under /hbase/.logs
>> > and I try to restart the master, thousands of DataStreamer and
>> > ResponseProcessor threads are created to handle the hlogs.
>> > The master then quickly hits an OOME. Is there any way to control
>> > this situation?
>> >
>> > On Fri, Mar 18, 2011 at 12:20 AM, Jean-Daniel Cryans
>> > <[email protected]> wrote:
>> >
>> >> You can limit the number of WALs and their size on the region server by
>> >> tuning:
>> >>
>> >> hbase.regionserver.maxlogs the default is 32
>> >> hbase.regionserver.hlog.blocksize the default is your HDFS block size
>> >> times 0.95
>> >>
>> >> You can limit the number of parallel threads in the master by tuning:
>> >>
>> >> hbase.regionserver.hlog.splitlog.writer.threads the default is 3
>> >> hbase.regionserver.hlog.splitlog.buffersize the default is 1024*1024*128
>> >>
>> >> J-D
>> >>
>> >> On Wed, Mar 16, 2011 at 11:57 PM, 茅旭峰 <[email protected]> wrote:
>> >> > Hi,
>> >> >
>> >> > In our tests, we've accumulated lots of WAL logs in .logs, which
>> >> > leads to quite long pauses or even OOME when restarting either
>> >> > the master or a region server. We're doing a sort of bulk import
>> >> > and have not been using bulk import tricks, like turning off the
>> >> > WAL. We don't know in advance how our applications will really
>> >> > use HBase, so it is possible that users will do batch imports
>> >> > unless we're running out of space. I wonder if there is any
>> >> > property to control the size of the WAL. Would setting a smaller
>> >> > 'hbase.regionserver.logroll.period' help?
>> >> >
>> >> > On the other hand, since we have lots of regions, the master
>> >> > easily runs into OOME because of the memory occupied by the
>> >> > instance of Assignment.regions. When we tried to restart the
>> >> > master, it always died with OOME. From the hprof file, I think
>> >> > this is because the instance of HLogSplitter$OutputSink holds too
>> >> > many HLogSplitter$WriterAndPaths in logWriters, which in turn
>> >> > hold the buffers of wal.SequenceFileLogWriter.
>> >> > Is there any trick to avoid this kind of scenario?
>> >> >
>> >> > Thanks and regards,
>> >> >
>> >> > Mao Xu-Feng
>> >> >
>> >>
>> >
>>
>
