Regards, Otis.
SPM is free to use?

Best wishes
On 09/25/2012 11:50 PM, Otis Gospodnetic wrote:
Hi,

And the servers are identical in terms of hardware, JVM, etc.?
What does your monitoring tool show is different on problematic
machines, other than the load?  How's JVM heap and GC looking?  Have a
look at SPM for HBase (see sig).  You could install the agent/client
on the problematic servers + a couple non-problematic servers and
visually compare a bunch of different HBase/JVM/OS metrics.

HTH
Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html


On Tue, Sep 25, 2012 at 11:45 PM, Yusup Ashrap <[email protected]> wrote:
Hi all , I am new to hbase. I have a 30+ nodes sized cluster running up in
the production environment.
problem is that several nodes  of cluster suffer  from occassional  high
loads( 30-60) .twice or three times a day and it happens every day.
It's kinda emergeny situation for me now and I dont have too much time to
dive in hbase to figure out what's wrong with my cluster.
I hope someone help me  point out  what is or could be  wrong with my
cluster or any steps to how to find out the problem , thanks


*hbase version*:0.90.2, r
*region server metrics*:
request=3256.1, readRequest=26313.0, readDataSize=2173227.0,
failedReadRequest=0.0, writeRequest=6253.0, writeDataSize=4273037.0,
failedWriteRequest=0.0, readResponseTime=29935.0, writeResponseTime=9298.0,
memStoreHitCount=0.0, memStoreMissCount=16413.0, rpcRequestCount=31877.0,
rpcRequestTime=790.0, aliveHandlerNum=200.0, aliveReaderNum=10.0,
handlerQueueSize=0.0, regions=338, stores=426, storefiles=354,
storefileSize=108349, storefileIndexSize=1116, memstoreSize=1550,
compactionQueueSize=0, flushQueueSize=0, usedHeap=10217, maxHeap=15872,
blockCacheSize=5122994416, blockCacheFree=1534205200,
blockCacheCount=71481, blockCacheHitCount=36328173832,
metaBlockCacheHitCount=23066398881, dataBlockCacheHitCount=13261774951,
blockCacheMissCount=2133761468, metaBlockCacheMissCount=219363,
dataBlockCacheMissCount=2133542105, blockCacheEvictedCount=1634548925,
blockCacheHitRatio=94, metaBlockCacheHitRatio=99,
dataBlockCacheHitRatio=86, blockCacheHitCachingRatio=95,
hdfsBlocksLocalityIndex=86

here is my region server( load avg 30+ )'s log:

2012-09-26 11:04:50,185 INFO org.apache.hadoop.hbase.regionserver.Store:
Added
hdfs://hdfs_xxx:9516/hbase/table_a/232d346a258d160bc61618bafe91f047/info/865605993795928725,
entries=79472, sequenceid=11417919090, memsize=21.7m, filesize=1.2m
2012-09-26 11:04:50,187 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Finished memstore flush of ~21.7m for region
table_a,10396613269\x01NLKMLMHFGFOOO\x01NLKMLMHFGFLFM\x01AntispamPunisher\x01011\x01000000,1345016652050.232d346a258d160bc61618bafe91f047.
in 752ms, sequenceid=11417919090, compaction requested=true
2012-09-26 11:04:50,187 DEBUG
org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction
requested for
table_a,10396613269\x01NLKMLMHFGFOOO\x01NLKMLMHFGFLFM\x01AntispamPunisher\x01011\x01000000,1345016652050.232d346a258d160bc61618bafe91f047.
because regionserver60020.cacheFlusher; priority=16, compaction queue
size=19
2012-09-26 11:05:01,431 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
started; Attempting to free 634.93 MB of total=5.27 GB
2012-09-26 11:05:01,505 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
completed; freed=634.96 MB, total=4.65 GB, single=1.43 GB, multi=3.79 GB,
memory=0 KB
2012-09-26 11:05:01,544 INFO org.apache.hadoop.io.compress.CodecPool: Got
brand-new decompressor
2012-09-26 11:05:21,256 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
started; Attempting to free 634.92 MB of total=5.27 GB
2012-09-26 11:05:21,333 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
completed; freed=634.94 MB, total=4.65 GB, single=1.43 GB, multi=3.79 GB,
memory=0 KB
2012-09-26 11:05:40,451 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
started; Attempting to free 634.93 MB of total=5.27 GB
2012-09-26 11:05:40,515 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
completed; freed=634.99 MB, total=4.65 GB, single=1.42 GB, multi=3.8 GB,
memory=0 KB
2012-09-26 11:05:59,978 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
started; Attempting to free 634.93 MB of total=5.27 GB
2012-09-26 11:06:00,051 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
completed; freed=634.99 MB, total=4.65 GB, single=1.42 GB, multi=3.8 GB,
memory=0 KB
2012-09-26 11:06:07,836 INFO org.apache.hadoop.io.compress.CodecPool: Got
brand-new decompressor
2012-09-26 11:06:18,443 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
started; Attempting to free 634.91 MB of total=5.27 GB
2012-09-26 11:06:18,505 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
completed; freed=634.93 MB, total=4.65 GB, single=1.43 GB, multi=3.79 GB,
memory=0 KB
2012-09-26 11:06:38,674 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
started; Attempting to free 634.89 MB of total=5.27 GB
2012-09-26 11:06:38,739 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
completed; freed=634.93 MB, total=4.65 GB, single=1.43 GB, multi=3.79 GB,
memory=0 KB
2012-09-26 11:06:50,616 INFO
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Using
syncFs -- HDFS-200
2012-09-26 11:06:50,630 INFO org.apache.hadoop.hbase.regionserver.wal.HLog:
Roll
/hbase/.logs/my226021.cm4,60020,1333724247105/my226021.cm4%3A60020.1348628673307,
entries=81412, filesize=63758937. New hlog
/hbase/.logs/my226021.cm4,60020,1333724247105/my226021.cm4%3A60020.1348628810604
2012-09-26 11:06:50,659 INFO org.apache.hadoop.hbase.regionserver.wal.HLog:
Too many hlogs: logs=33, maxlogs=32; forcing flush of 45 regions(s):
016138b1500faade07bc92694f222dd0, 064d4407646b317e7cfd4c35efa0ecfd,
07a185ab3fd56c75ee51dadc75911d55, 0842a4e65aea6d65c1d3ce4425ab8ef0,
0cfafa661d1819b4e81a4729c89df010, 1c435cddad229f4bf8283f4642cb5f0d,
293f2e6c4925f65e370015784a1392a9, 363e160b42f4025a58f074260f1161ce,
37e28e04133dfd3a74a6acb7ee5a45f8, 38766d1c77e373fb476fa9854443531c,
3f4a868b7233ec6fe8be88e3c2ca73ba, 45cf01b2f8744b033087281308862859,
468933ece8913bd7e06c2a6cbd9dc5e6, 480cc150655848e55217b3c63c2ea749,
4c4c7beee4eeb8e3d4d1e689aa16ae39, 5045e2289406a2ebe6fda7d15cd64e4d,
519a99c62386004dc6fb22308c6b0ac0, 578a6c0d02b8cc0807e60b452c1e5add,
5c130a1e57b5dc85b40bc7dd10b4aa9b, 6259605302af1b5ef3accd2c34aeecd7,
685370c4349d78fd04efbe071183cc0e, 69080c7e1d14fefccc5e106e955e405d,
6d2d97fcef2886ce2936cbf611648de7, 855e3f6a7eaa27b6a6952a33c06f47c9,
875b26219f934896756b3f80fd7e7a75, 8fd673d4475a0294845a8681253432ff,
9993f31115e8e59450f833771b03a95e, a0cb8dd41fa75f46fc9917c53f7b3b92,
a6b0b04c13f22dd20b5fd5c3a0db5270, a8d401355719a12d9f7d2881a4be9fc8,
ad6625d0afe0199eb7ba05c07f0a87a3, b845d168b3ab52fe5ae9e2e4366098fc,
b9f1139ea5a2e9b512a3e1a173212d70, bec996064a36cc306af18f771965679c,
c744a27faf49953ea5aa17708e4aefdd, cab712596aeaa41fc6465696ed4ad74a,
cf83ec10220cd516ba55a0b2a5c2f711, d00c2663772b4f453a4c946032de66df,
d7e596ea520902f4ed35ce28449a0593, d9058055eab307828c4bb41fab664ab5,
e43b26752173becfdffbbc75c70f849e, ecc4925b7791123cc452a39c542cbf9f,
efbb1004f1b68c98350fbe38b343a308, f1288916c8b877ccf6cac252e1af0a18,
fc5446f5a44ca115ba63a21dca420c64
2012-09-26 11:06:50,659 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
Started memstore flush for
table_a,10904703985\x01NLKMHOJIOHOOO\x01NLKMHOJIOHHIM\x01PriceDispPunisher\x01031\x01000000,1345202033439.016138b1500faade07bc92694f222dd0.,
current region memstore size 24.9m
2012-09-26 11:06:50,659 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
Finished snapshotting, commencing flushing stores
2012-09-26 11:06:51,389 INFO
org.apache.hadoop.hbase.regionserver.StoreFile: Bloom added to HFile
(hdfs://hdfs_xxx:9516/hbase/table_a/016138b1500faade07bc92694f222dd0/.tmp/4855940649973018735):
13.4k, 5703/11406 (50%)
2012-09-26 11:06:52,055 INFO org.apache.hadoop.hbase.regionserver.Store:
Renaming flushed file at
hdfs://hdfs_xxx:9516/hbase/table_a/016138b1500faade07bc92694f222dd0/.tmp/4855940649973018735
to
hdfs://hdfs_xxx:9516/hbase/table_a/016138b1500faade07bc92694f222dd0/info/6325203002828183467
2012-09-26 11:06:52,070 INFO
org.apache.hadoop.hbase.regionserver.StoreFile$Reader: Loaded row bloom
filter metadata for
hdfs://hdfs_xxx:9516/hbase/table_a/016138b1500faade07bc92694f222dd0/info/6325203002828183467
2012-09-26 11:06:52,070 INFO org.apache.hadoop.hbase.regionserver.Store:
Added
hdfs://hdfs_xxx:9516/hbase/table_a/016138b1500faade07bc92694f222dd0/info/6325203002828183467,
entries=91248, sequenceid=11417991467, memsize=24.9m, filesize=1.4m
2012-09-26 11:06:52,071 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Finished memstore flush of ~24.9m for region
table_a,10904703985\x01NLKMHOJIOHOOO\x01NLKMHOJIOHHIM\x01PriceDispPunisher\x01031\x01000000,1345202033439.016138b1500faade07bc92694f222dd0.
in 1412ms, sequenceid=11417991467, compaction requested=true
2012-09-26 11:06:52,071 DEBUG
org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction
requested for
table_a,10904703985\x01NLKMHOJIOHOOO\x01NLKMHOJIOHHIM\x01PriceDispPunisher\x01031\x01000000,1345202033439.016138b1500faade07bc92694f222dd0.
because regionserver60020.cacheFlusher; priority=17, compaction queue
size=20
2012-09-26 11:06:52,071 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
Started memstore flush for
table_a,14095325123\x01NLKLKJFHIOOOO\x01NLKLKJFHIOHII\x01AntispamPunisher\x01001\x01000000,1345106676203.064d4407646b317e7cfd4c35efa0ecfd.,
current region memstore size 18.0m
2012-09-26 11:06:52,071 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
Finished snapshotting, commencing flushing stores
2012-09-26 11:06:52,599 INFO
org.apache.hadoop.hbase.regionserver.StoreFile: Bloom added to HFile
(hdfs://hdfs_xxx:9516/hbase/table_a/064d4407646b317e7cfd4c35efa0ecfd/.tmp/3436108128903667477):
9.7k, 4124/8248 (50%)
2012-09-26 11:06:52,622 INFO org.apache.hadoop.hbase.regionserver.Store:
Renaming flushed file at
hdfs://hdfs_xxx:9516/hbase/table_a/064d4407646b317e7cfd4c35efa0ecfd/.tmp/3436108128903667477
to
hdfs://hdfs_xxx:9516/hbase/table_a/064d4407646b317e7cfd4c35efa0ecfd/info/2027379172809216464
2012-09-26 11:06:52,637 INFO
org.apache.hadoop.hbase.regionserver.StoreFile$Reader: Loaded row bloom
filter metadata for
hdfs://hdfs_xxx:9516/hbase/table_a/064d4407646b317e7cfd4c35efa0ecfd/info/2027379172809216464
2012-09-26 11:06:52,637 INFO org.apache.hadoop.hbase.regionserver.Store:
Added
hdfs://hdfs_xxx:9516/hbase/table_a/064d4407646b317e7cfd4c35efa0ecfd/info/2027379172809216464,
entries=65984, sequenceid=11417992318, memsize=18.0m, filesize=1.0m
10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS CIENCIAS 
INFORMATICAS...
CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION

http://www.uci.cu
http://www.facebook.com/universidad.uci
http://www.flickr.com/photos/universidad_uci

--

Marcos Luis Ortíz Valmaseda
*Data Engineer && Sr. System Administrator at UCI*
about.me/marcosortiz <http://about.me/marcosortiz>
My Blog <http://marcosluis2186.posterous.com>
Tumblr's blog <http://marcosortiz.tumblr.com/>
@marcosluis2186 <http://twitter.com/marcosluis2186>



10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS CIENCIAS 
INFORMATICAS...
CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION

http://www.uci.cu
http://www.facebook.com/universidad.uci
http://www.flickr.com/photos/universidad_uci

Reply via email to