Hi all , I am new to hbase. I have a 30+ nodes sized cluster running up in
the production environment.
problem is that several nodes of cluster suffer from occassional high
loads( 30-60) .twice or three times a day and it happens every day.
It's kinda emergeny situation for me now and I dont have too much time to
dive in hbase to figure out what's wrong with my cluster.
I hope someone help me point out what is or could be wrong with my
cluster or any steps to how to find out the problem , thanks
*hbase version*:0.90.2, r
*region server metrics*:
request=3256.1, readRequest=26313.0, readDataSize=2173227.0,
failedReadRequest=0.0, writeRequest=6253.0, writeDataSize=4273037.0,
failedWriteRequest=0.0, readResponseTime=29935.0, writeResponseTime=9298.0,
memStoreHitCount=0.0, memStoreMissCount=16413.0, rpcRequestCount=31877.0,
rpcRequestTime=790.0, aliveHandlerNum=200.0, aliveReaderNum=10.0,
handlerQueueSize=0.0, regions=338, stores=426, storefiles=354,
storefileSize=108349, storefileIndexSize=1116, memstoreSize=1550,
compactionQueueSize=0, flushQueueSize=0, usedHeap=10217, maxHeap=15872,
blockCacheSize=5122994416, blockCacheFree=1534205200,
blockCacheCount=71481, blockCacheHitCount=36328173832,
metaBlockCacheHitCount=23066398881, dataBlockCacheHitCount=13261774951,
blockCacheMissCount=2133761468, metaBlockCacheMissCount=219363,
dataBlockCacheMissCount=2133542105, blockCacheEvictedCount=1634548925,
blockCacheHitRatio=94, metaBlockCacheHitRatio=99,
dataBlockCacheHitRatio=86, blockCacheHitCachingRatio=95,
hdfsBlocksLocalityIndex=86
here is my region server( load avg 30+ )'s log:
2012-09-26 11:04:50,185 INFO org.apache.hadoop.hbase.regionserver.Store:
Added
hdfs://hdfs_xxx:9516/hbase/table_a/232d346a258d160bc61618bafe91f047/info/865605993795928725,
entries=79472, sequenceid=11417919090, memsize=21.7m, filesize=1.2m
2012-09-26 11:04:50,187 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Finished memstore flush of ~21.7m for region
table_a,10396613269\x01NLKMLMHFGFOOO\x01NLKMLMHFGFLFM\x01AntispamPunisher\x01011\x01000000,1345016652050.232d346a258d160bc61618bafe91f047.
in 752ms, sequenceid=11417919090, compaction requested=true
2012-09-26 11:04:50,187 DEBUG
org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction
requested for
table_a,10396613269\x01NLKMLMHFGFOOO\x01NLKMLMHFGFLFM\x01AntispamPunisher\x01011\x01000000,1345016652050.232d346a258d160bc61618bafe91f047.
because regionserver60020.cacheFlusher; priority=16, compaction queue
size=19
2012-09-26 11:05:01,431 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
started; Attempting to free 634.93 MB of total=5.27 GB
2012-09-26 11:05:01,505 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
completed; freed=634.96 MB, total=4.65 GB, single=1.43 GB, multi=3.79 GB,
memory=0 KB
2012-09-26 11:05:01,544 INFO org.apache.hadoop.io.compress.CodecPool: Got
brand-new decompressor
2012-09-26 11:05:21,256 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
started; Attempting to free 634.92 MB of total=5.27 GB
2012-09-26 11:05:21,333 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
completed; freed=634.94 MB, total=4.65 GB, single=1.43 GB, multi=3.79 GB,
memory=0 KB
2012-09-26 11:05:40,451 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
started; Attempting to free 634.93 MB of total=5.27 GB
2012-09-26 11:05:40,515 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
completed; freed=634.99 MB, total=4.65 GB, single=1.42 GB, multi=3.8 GB,
memory=0 KB
2012-09-26 11:05:59,978 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
started; Attempting to free 634.93 MB of total=5.27 GB
2012-09-26 11:06:00,051 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
completed; freed=634.99 MB, total=4.65 GB, single=1.42 GB, multi=3.8 GB,
memory=0 KB
2012-09-26 11:06:07,836 INFO org.apache.hadoop.io.compress.CodecPool: Got
brand-new decompressor
2012-09-26 11:06:18,443 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
started; Attempting to free 634.91 MB of total=5.27 GB
2012-09-26 11:06:18,505 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
completed; freed=634.93 MB, total=4.65 GB, single=1.43 GB, multi=3.79 GB,
memory=0 KB
2012-09-26 11:06:38,674 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
started; Attempting to free 634.89 MB of total=5.27 GB
2012-09-26 11:06:38,739 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
completed; freed=634.93 MB, total=4.65 GB, single=1.43 GB, multi=3.79 GB,
memory=0 KB
2012-09-26 11:06:50,616 INFO
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Using
syncFs -- HDFS-200
2012-09-26 11:06:50,630 INFO org.apache.hadoop.hbase.regionserver.wal.HLog:
Roll
/hbase/.logs/my226021.cm4,60020,1333724247105/my226021.cm4%3A60020.1348628673307,
entries=81412, filesize=63758937. New hlog
/hbase/.logs/my226021.cm4,60020,1333724247105/my226021.cm4%3A60020.1348628810604
2012-09-26 11:06:50,659 INFO org.apache.hadoop.hbase.regionserver.wal.HLog:
Too many hlogs: logs=33, maxlogs=32; forcing flush of 45 regions(s):
016138b1500faade07bc92694f222dd0, 064d4407646b317e7cfd4c35efa0ecfd,
07a185ab3fd56c75ee51dadc75911d55, 0842a4e65aea6d65c1d3ce4425ab8ef0,
0cfafa661d1819b4e81a4729c89df010, 1c435cddad229f4bf8283f4642cb5f0d,
293f2e6c4925f65e370015784a1392a9, 363e160b42f4025a58f074260f1161ce,
37e28e04133dfd3a74a6acb7ee5a45f8, 38766d1c77e373fb476fa9854443531c,
3f4a868b7233ec6fe8be88e3c2ca73ba, 45cf01b2f8744b033087281308862859,
468933ece8913bd7e06c2a6cbd9dc5e6, 480cc150655848e55217b3c63c2ea749,
4c4c7beee4eeb8e3d4d1e689aa16ae39, 5045e2289406a2ebe6fda7d15cd64e4d,
519a99c62386004dc6fb22308c6b0ac0, 578a6c0d02b8cc0807e60b452c1e5add,
5c130a1e57b5dc85b40bc7dd10b4aa9b, 6259605302af1b5ef3accd2c34aeecd7,
685370c4349d78fd04efbe071183cc0e, 69080c7e1d14fefccc5e106e955e405d,
6d2d97fcef2886ce2936cbf611648de7, 855e3f6a7eaa27b6a6952a33c06f47c9,
875b26219f934896756b3f80fd7e7a75, 8fd673d4475a0294845a8681253432ff,
9993f31115e8e59450f833771b03a95e, a0cb8dd41fa75f46fc9917c53f7b3b92,
a6b0b04c13f22dd20b5fd5c3a0db5270, a8d401355719a12d9f7d2881a4be9fc8,
ad6625d0afe0199eb7ba05c07f0a87a3, b845d168b3ab52fe5ae9e2e4366098fc,
b9f1139ea5a2e9b512a3e1a173212d70, bec996064a36cc306af18f771965679c,
c744a27faf49953ea5aa17708e4aefdd, cab712596aeaa41fc6465696ed4ad74a,
cf83ec10220cd516ba55a0b2a5c2f711, d00c2663772b4f453a4c946032de66df,
d7e596ea520902f4ed35ce28449a0593, d9058055eab307828c4bb41fab664ab5,
e43b26752173becfdffbbc75c70f849e, ecc4925b7791123cc452a39c542cbf9f,
efbb1004f1b68c98350fbe38b343a308, f1288916c8b877ccf6cac252e1af0a18,
fc5446f5a44ca115ba63a21dca420c64
2012-09-26 11:06:50,659 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
Started memstore flush for
table_a,10904703985\x01NLKMHOJIOHOOO\x01NLKMHOJIOHHIM\x01PriceDispPunisher\x01031\x01000000,1345202033439.016138b1500faade07bc92694f222dd0.,
current region memstore size 24.9m
2012-09-26 11:06:50,659 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
Finished snapshotting, commencing flushing stores
2012-09-26 11:06:51,389 INFO
org.apache.hadoop.hbase.regionserver.StoreFile: Bloom added to HFile
(hdfs://hdfs_xxx:9516/hbase/table_a/016138b1500faade07bc92694f222dd0/.tmp/4855940649973018735):
13.4k, 5703/11406 (50%)
2012-09-26 11:06:52,055 INFO org.apache.hadoop.hbase.regionserver.Store:
Renaming flushed file at
hdfs://hdfs_xxx:9516/hbase/table_a/016138b1500faade07bc92694f222dd0/.tmp/4855940649973018735
to
hdfs://hdfs_xxx:9516/hbase/table_a/016138b1500faade07bc92694f222dd0/info/6325203002828183467
2012-09-26 11:06:52,070 INFO
org.apache.hadoop.hbase.regionserver.StoreFile$Reader: Loaded row bloom
filter metadata for
hdfs://hdfs_xxx:9516/hbase/table_a/016138b1500faade07bc92694f222dd0/info/6325203002828183467
2012-09-26 11:06:52,070 INFO org.apache.hadoop.hbase.regionserver.Store:
Added
hdfs://hdfs_xxx:9516/hbase/table_a/016138b1500faade07bc92694f222dd0/info/6325203002828183467,
entries=91248, sequenceid=11417991467, memsize=24.9m, filesize=1.4m
2012-09-26 11:06:52,071 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Finished memstore flush of ~24.9m for region
table_a,10904703985\x01NLKMHOJIOHOOO\x01NLKMHOJIOHHIM\x01PriceDispPunisher\x01031\x01000000,1345202033439.016138b1500faade07bc92694f222dd0.
in 1412ms, sequenceid=11417991467, compaction requested=true
2012-09-26 11:06:52,071 DEBUG
org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction
requested for
table_a,10904703985\x01NLKMHOJIOHOOO\x01NLKMHOJIOHHIM\x01PriceDispPunisher\x01031\x01000000,1345202033439.016138b1500faade07bc92694f222dd0.
because regionserver60020.cacheFlusher; priority=17, compaction queue
size=20
2012-09-26 11:06:52,071 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
Started memstore flush for
table_a,14095325123\x01NLKLKJFHIOOOO\x01NLKLKJFHIOHII\x01AntispamPunisher\x01001\x01000000,1345106676203.064d4407646b317e7cfd4c35efa0ecfd.,
current region memstore size 18.0m
2012-09-26 11:06:52,071 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
Finished snapshotting, commencing flushing stores
2012-09-26 11:06:52,599 INFO
org.apache.hadoop.hbase.regionserver.StoreFile: Bloom added to HFile
(hdfs://hdfs_xxx:9516/hbase/table_a/064d4407646b317e7cfd4c35efa0ecfd/.tmp/3436108128903667477):
9.7k, 4124/8248 (50%)
2012-09-26 11:06:52,622 INFO org.apache.hadoop.hbase.regionserver.Store:
Renaming flushed file at
hdfs://hdfs_xxx:9516/hbase/table_a/064d4407646b317e7cfd4c35efa0ecfd/.tmp/3436108128903667477
to
hdfs://hdfs_xxx:9516/hbase/table_a/064d4407646b317e7cfd4c35efa0ecfd/info/2027379172809216464
2012-09-26 11:06:52,637 INFO
org.apache.hadoop.hbase.regionserver.StoreFile$Reader: Loaded row bloom
filter metadata for
hdfs://hdfs_xxx:9516/hbase/table_a/064d4407646b317e7cfd4c35efa0ecfd/info/2027379172809216464
2012-09-26 11:06:52,637 INFO org.apache.hadoop.hbase.regionserver.Store:
Added
hdfs://hdfs_xxx:9516/hbase/table_a/064d4407646b317e7cfd4c35efa0ecfd/info/2027379172809216464,
entries=65984, sequenceid=11417992318, memsize=18.0m, filesize=1.0m