[
https://issues.apache.org/jira/browse/HBASE-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12998125#comment-12998125
]
Marc Limotte commented on HBASE-3551:
-------------------------------------
Here's some more detail about the situation that Stack and I saw:
From the region server UI (via lynx):
HBase Version:    0.90.0, r0b7903c50eef589c632582f7d9d6364eb3912c38
                  (HBase version and svn revision)
HBase Compiled:   Mon Jan 24 20:44:24 UTC 2011, root
                  (when HBase version was compiled and by whom)
Metrics:          request=0.0, regions=107, stores=214, storefiles=381,
                  storefileIndexSize=2983, memstoreSize=0,
                  compactionQueueSize=29, usedHeap=3774, maxHeap=7141,
                  blockCacheSize=509777848, blockCacheFree=987798472,
                  blockCacheCount=7557, blockCacheHitCount=60151,
                  blockCacheMissCount=38698247, blockCacheEvictedCount=0,
                  blockCacheHitRatio=0, blockCacheHitCachingRatio=88
                  (RegionServer metrics; file and heap sizes are in megabytes)
Zookeeper Quorum: ip-xxxxxxxxx.ec2.internal:2181
                  (addresses of all registered ZK servers)
So, almost 3 GB for the index (storefileIndexSize=2983, in megabytes).
Inputs: 1-2 stores per region, storefile-size = 1 GB, HBase block size = 64 KB
num-of-entries-per-storefile = storefile-size / hbase-block-size
estimated index size = num-of-entries-per-storefile * num-store-files * key-and-entry-size
key-and-entry-size = 20 to 200 bytes => 150 bytes (guess)
estimated index size = (1G / 64K) * 381 * 150 = ~900M (much less than 2983M)
This doesn't account for any overhead in the index, but it's hard to imagine
that overhead alone would account for a 3X difference in size.
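For anyone who wants to rerun the numbers with different guesses, here is a
minimal sketch of the estimate above in Python. The function name is made up
for illustration, and it assumes every store file is a full 1 GB with exactly
one index entry per 64 KB block, which is the same simplification used above
and not necessarily how HBase lays out its index.

```python
# Back-of-the-envelope estimate of total store file index size.
# Assumption: one index entry per HBase block, all store files the same size.
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3


def estimate_index_bytes(storefile_bytes, block_bytes, num_storefiles, entry_bytes):
    """Return the estimated total index size in bytes (hypothetical helper)."""
    entries_per_storefile = storefile_bytes // block_bytes
    return entries_per_storefile * num_storefiles * entry_bytes


# Values from the region server metrics and the guesses above.
estimated = estimate_index_bytes(
    storefile_bytes=1 * GIB,
    block_bytes=64 * KIB,
    num_storefiles=381,
    entry_bytes=150,  # guessed key-and-entry size
)
estimated_mb = estimated / MIB
reported_mb = 2983  # storefileIndexSize from the metrics line

print(f"estimated index size: {estimated_mb:.0f} MB")
print(f"reported / estimated: {reported_mb / estimated_mb:.1f}x")
```

Changing entry_bytes to 200 (the top of the guessed range) still leaves the
estimate well under the reported figure.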
Also, our compaction queue is fairly deep (due to forced major compactions).
What impact could that have on storefileIndexSize?
> Loaded hfile indexes occupy a good chunk of heap; look into shrinking the
> amount used and/or evicting unused indices
> --------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-3551
> URL: https://issues.apache.org/jira/browse/HBASE-3551
> Project: HBase
> Issue Type: Improvement
> Reporter: stack
>
> I hung with a user Marc and we were looking over configs and his cluster
> profile up on ec2. One thing we noticed was that his 100+ 1G regions of two
> families had ~2.5G of heap resident. We did a bit of math and couldn't get
> to 2.5G so that needs looking into. Even still, 2.5G is a bunch of heap to
> give over to indices (He actually OOME'd when he had his RS heap set to just
> 3G; we shouldn't OOME, we should just run slower). It sounds like he needs
> the indices loaded but still, for some cases we should drop indices for
> unaccessed files.