Regarding hbase.regionserver.maxlogs, I've set it to 2, but the number of files under /hbase/.logs still keeps increasing. I see lots of log entries like these:

====
2011-03-22 00:00:07,156 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for table1,sZD5CTBLUdV55xWWkmkI5rb1mJM=,1300587568567.8a84acf58dd3d684ccaa47d4fb4fd53a. because regionserver60020.cacheFlusher; priority=-8, compaction queue size=1755
2011-03-22 00:00:07,183 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up with memory above low water.
2011-03-22 00:00:07,186 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Under global heap pressure: Region table1,KGBJhl9RON29fT0hhak5-tBc-zs=,1300521656641.56e3a141164b546ae84d57e46a513922. has too many store files, but is 6.2m vs best flushable region's 2.1m. Choosing the bigger.
2011-03-22 00:00:07,186 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region table1,KGBJhl9RON29fT0hhak5-tBc-zs=,1300521656641.56e3a141164b546ae84d57e46a513922. due to global heap pressure
2011-03-22 00:00:07,186 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for table1,KGBJhl9RON29fT0hhak5-tBc-zs=,1300521656641.56e3a141164b546ae84d57e46a513922., current region memstore size 6.2m
2011-03-22 00:00:07,201 INFO org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Using syncFs -- HDFS-200
2011-03-22 00:00:07,241 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/cloud138,60020,1300712706331/cloud138%3A60020.1300723196796, entries=119, filesize=67903254. New hlog /hbase/.logs/cloud138,60020,1300712706331/cloud138%3A60020.1300723207156
2011-03-22 00:00:07,241 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=398, maxlogs=2; forcing flush of 1 regions(s): 334c81997502eb3c66c2bb9b47a87bcc
2011-03-22 00:00:07,242 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished snapshotting, commencing flushing stores
2011-03-22 00:00:07,577 INFO org.apache.hadoop.hbase.regionserver.Store: Renaming flushed file at hdfs://cloud137:9000/hbase/table1/56e3a141164b546ae84d57e46a513922/.tmp/907665384208923152 to hdfs://cloud137:9000/hbase/table1/56e3a141164b546ae84d57e46a513922/cfEStore/2298819588481793315
2011-03-22 00:00:07,589 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://cloud137:9000/hbase/table1/56e3a141164b546ae84d57e46a513922/cfEStore/2298819588481793315, entries=6, sequenceid=2229486, memsize=6.2m, filesize=6.2m
2011-03-22 00:00:07,591 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.2m for region table1,KGBJhl9RON29fT0hhak5-tBc-zs=,1300521656641.56e3a141164b546ae84d57e46a513922. in 405ms, sequenceid=2229486, compaction requested=true
====
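
For anyone who wants to track how far the hlog count drifts from maxlogs over time, here is a rough Python sketch that pulls the numbers out of region server log lines. The two regular expressions are modeled only on the "Roll" and "Too many hlogs" messages in the excerpt above, so other HBase versions may need different patterns. The function name summarize_hlog_backlog is made up for this illustration, and the backlog figure is only an estimate that assumes every outstanding hlog is about as large as the rolled ones seen.

```python
import re

# Patterns modeled on the two HLog messages in the excerpt above.
# Assumption: other HBase versions may word these lines differently.
ROLL_RE = re.compile(r"HLog: Roll \S+, entries=(\d+), filesize=(\d+)\.")
TOO_MANY_RE = re.compile(r"Too many hlogs: logs=(\d+), maxlogs=(\d+)")


def summarize_hlog_backlog(lines):
    """Return (latest hlog count, maxlogs, mean rolled file size in bytes)."""
    sizes = []
    logs = maxlogs = None
    for line in lines:
        roll = ROLL_RE.search(line)
        if roll:
            sizes.append(int(roll.group(2)))
        too_many = TOO_MANY_RE.search(line)
        if too_many:
            # Keep the most recent values seen in the log.
            logs, maxlogs = int(too_many.group(1)), int(too_many.group(2))
    mean_size = sum(sizes) / len(sizes) if sizes else 0.0
    return logs, maxlogs, mean_size


# The two relevant lines from the excerpt, copied as they appear there.
sample = [
    "2011-03-22 00:00:07,241 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: "
    "Roll /hbase/.logs/cloud138,60020,1300712706331/cloud138%3A60020.1300723196796, "
    "entries=119, filesize=67903254. "
    "New hlog /hbase/.logs/cloud138,60020,1300712706331/cloud138%3A60020.1300723207156",
    "2011-03-22 00:00:07,241 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: "
    "Too many hlogs: logs=398, maxlogs=2; forcing flush of 1 regions(s): "
    "334c81997502eb3c66c2bb9b47a87bcc",
]

logs, maxlogs, mean_size = summarize_hlog_backlog(sample)
if logs is not None:
    # Rough estimate: assumes all outstanding hlogs are about mean_size bytes.
    backlog_gib = logs * mean_size / 2**30
    print(f"logs={logs} maxlogs={maxlogs} estimated backlog={backlog_gib:.1f} GiB")
```

Feeding it a whole region server log instead of the two sample lines would give the latest count together with the average size of the files rolled so far.
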
Does the "Too many hlogs" message mean there are too many requests for the region server to keep up with the growth of the hlogs?

On the other hand, when there are too many files under /hbase/.logs and I try to restart the master, it creates thousands of DataStreamer and ResponseProcessor threads to handle the hlogs, and the master quickly runs into an OOME. Is there any way to control this situation?

On Fri, Mar 18, 2011 at 12:20 AM, Jean-Daniel Cryans <[email protected]> wrote:

> You can limit the number of WALs and their size on the region server by
> tuning:
>
> hbase.regionserver.maxlogs the default is 32
> hbase.regionserver.hlog.blocksize the default is whatever your HDFS
> blocksize times 0.95
>
> You can limit the number of parallel threads in the master by tuning:
>
> hbase.regionserver.hlog.splitlog.writer.threads the default is 3
> hbase.regionserver.hlog.splitlog.buffersize the default is 1024*1024*128
>
> J-D
>
> On Wed, Mar 16, 2011 at 11:57 PM, 茅旭峰 <[email protected]> wrote:
> > Hi,
> >
> > In our tests, we've accumulated lots of WAL logs in .logs, which leads
> > to quite long pauses or even OOME when restarting either the master or
> > a region server. We're doing a sort of bulk import and are not using
> > bulk import tricks, like turning off the WAL feature. We don't know how
> > our applications will really use HBase, so it is possible that users
> > will do batch imports unless we're running out of space. I wonder if
> > there is any property to set to control the size of the WAL. Would
> > setting a smaller 'hbase.regionserver.logroll.period' help?
> >
> > On the other hand, since we have lots of regions, the master easily
> > runs into OOME because of the memory occupied by the instance of
> > Assignment.regions. When we were trying to restart the master, it
> > always died with OOME. I think, from the hprof file, it is because the
> > instance of HLogSplitter$OutputSink holds too many
> > HLogSplitter$WriterAndPaths in logWriters, which even hold the buffer
> > of wal.SequenceFileLogWriter.
> > Is there any trick to avoid this kind of scenario?
> >
> > Thanks and regards,
> >
> > Mao Xu-Feng
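
As a back-of-the-envelope check on the region server settings quoted above, the sketch below multiplies maxlogs by the per-file size to get the WAL volume at which forced flushes should start. It takes the quoted defaults at face value (32 logs, HDFS block size times 0.95). The 64 MiB HDFS block size and the helper name wal_soft_cap_bytes are assumptions made for this illustration, not values from the thread, so check the block size on your own cluster.

```python
def wal_soft_cap_bytes(maxlogs, hdfs_block_bytes, roll_fraction=0.95):
    """Approximate WAL volume at which a region server starts forcing flushes.

    roll_fraction=0.95 follows the default quoted in the reply above.
    """
    return int(maxlogs * hdfs_block_bytes * roll_fraction)


# Assumption: 64 MiB HDFS blocks; substitute your cluster's block size.
HDFS_BLOCK = 64 * 1024 * 1024

for maxlogs in (32, 2):
    cap = wal_soft_cap_bytes(maxlogs, HDFS_BLOCK)
    print(f"maxlogs={maxlogs}: about {cap / 2**20:.0f} MiB of WAL before flushes are forced")
```

Note that this is only a soft limit: in the log excerpt at the top of this message the server reports logs=398 with maxlogs=2, since going over maxlogs forces region flushes rather than blocking writes.
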
