[
https://issues.apache.org/jira/browse/HBASE-17819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088838#comment-16088838
]
Anoop Sam John commented on HBASE-17819:
----------------------------------------
These are the things I am trying out:
1. We have 2 enum refs, in the key and in BucketEntry. Change those to byte
fields storing just the ordinal; the enums have only a few constants, so a
byte is enough.
Result : Saving 6 bytes per entry
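A minimal sketch of item 1 (the enum and class names here are hypothetical stand-ins, not the actual patch):

```java
// Hypothetical sketch: keep only the enum's ordinal as a 1-byte field
// instead of a 4-byte reference to the enum constant.
enum BlockType { DATA, INDEX, BLOOM_CHUNK, META }

class CacheKey {
  // values() allocates a fresh array on every call, so cache it once.
  private static final BlockType[] TYPES = BlockType.values();

  private final byte blockTypeOrdinal; // 1 byte in place of a 4-byte ref

  CacheKey(BlockType type) {
    this.blockTypeOrdinal = (byte) type.ordinal();
  }

  BlockType getBlockType() {
    // Re-materialize the enum constant on read.
    return TYPES[blockTypeOrdinal];
  }
}
```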
2. Split BucketEntry into 2 classes, one per IOEngine type: file mode and RAM
backed. Only the RAM backed engine needs the ref count mechanism; in file
mode we can drop that state along with markedForEvict.
Result : Saving 21 bytes per entry for File mode
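Item 2 could look roughly like this (class and field names are hypothetical, not the actual patch):

```java
// Hypothetical sketch: a slim base entry for file-backed IOEngines, and a
// RAM-backed subclass that alone carries the ref-count/evict state.
class BucketEntry {
  int offsetBase;
  int length;
  long accessCounter;
  // ... other common fields; no refCount, no markedForEvict here
}

class RAMBucketEntry extends BucketEntry {
  volatile int refCount;           // only RAM-served blocks need ref counting
  volatile boolean markedForEvict;
}
```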
3. Change the refCount type from AtomicInteger to a volatile int. The
AtomicInteger object and its ref in BucketEntry take 20 bytes, whereas an int
needs only 4. For the atomic increment/decrement we mimic what AtomicInteger
does internally (Unsafe CAS).
Result : Saving 16 bytes per entry for RAM backed IOEngine
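The comment describes CAS-ing a plain volatile int via Unsafe; here is a sketch of the same idea using the public-API equivalent, AtomicIntegerFieldUpdater (class name hypothetical). The updater is a single static shared by all instances, so the per-entry cost stays at the 4-byte int:

```java
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;

class RefCountedEntry {
  // 4 bytes inline, vs ~20 bytes for an AtomicInteger object plus its ref.
  volatile int refCount;

  // One static updater for the whole class; it CASes the field directly.
  private static final AtomicIntegerFieldUpdater<RefCountedEntry> REF_COUNT =
      AtomicIntegerFieldUpdater.newUpdater(RefCountedEntry.class, "refCount");

  int incrementRefCount() { return REF_COUNT.incrementAndGet(this); }

  int decrementRefCount() { return REF_COUNT.decrementAndGet(this); }
}
```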
4. Remove the CSLM used for tracking blocks per HFile. To remove blocks when
an HFile is closed, we then have to iterate over all bucket entries and check
each one's HFile before removing; this is what we do in the LRU cache. Since
this operation is not on a hot path, it should be ok: it happens when
CompactedHFilesDischarger runs (at a 2 minute interval) and removes all
compacted away files.
Result : Saving 40 bytes per entry
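Item 4 in sketch form: evicting a closed HFile's blocks by a full scan of the backing map rather than via a per-file ConcurrentSkipListSet. The key type here is a plain "hfileName,offset" string for brevity; the real map is keyed by BlockCacheKey:

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: no blocksByHFile index; scan backingMap instead.
class ScanEvictCache {
  // Simplified: key is "hfileName,offset"; the real key is a BlockCacheKey.
  final Map<String, Integer> backingMap = new ConcurrentHashMap<>();

  // Called when CompactedHFilesDischarger discards a compacted-away file;
  // not a hot path, so an O(n) scan is acceptable.
  int evictBlocksByHfileName(String hfileName) {
    int evicted = 0;
    Iterator<Map.Entry<String, Integer>> it = backingMap.entrySet().iterator();
    while (it.hasNext()) {
      if (it.next().getKey().startsWith(hfileName + ",")) {
        it.remove(); // weakly consistent iterator: safe on ConcurrentHashMap
        evicted++;
      }
    }
    return evicted;
  }
}
```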
> Reduce the heap overhead for BucketCache
> ----------------------------------------
>
> Key: HBASE-17819
> URL: https://issues.apache.org/jira/browse/HBASE-17819
> Project: HBase
> Issue Type: Sub-task
> Components: BucketCache
> Reporter: Anoop Sam John
> Assignee: Anoop Sam John
> Fix For: 2.0.0
>
>
> We keep the bucket entry map in BucketCache. Below is the math of the heap
> size for the key and value in this map.
> BlockCacheKey
> ---------------
> String hfileName - Ref - 4
> long offset - 8
> BlockType blockType - Ref - 4
> boolean isPrimaryReplicaBlock - 1
> Total = 12 (Object) + 17 = 29
> BucketEntry
> ------------
> int offsetBase - 4
> int length - 4
> byte offset1 - 1
> byte deserialiserIndex - 1
> long accessCounter - 8
> BlockPriority priority - Ref - 4
> volatile boolean markedForEvict - 1
> AtomicInteger refCount - 16 + 4
> long cachedTime - 8
> Total = 12 (Object) + 51 = 63
> ConcurrentHashMap Map.Entry - 40
> blocksByHFile ConcurrentSkipListSet Entry - 40
> Total = 29 + 63 + 80 = 172
> For 10 million blocks we will end up having 1.6GB of heap size.
> This jira aims to reduce this overhead as much as possible.
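The quoted per-entry totals check out; restated as arithmetic (the constant names are mine, not from the issue):

```java
// Per-entry heap cost from the quoted breakdown, in bytes.
class HeapMath {
  static final int KEY = 12 + 4 + 8 + 4 + 1;                         // BlockCacheKey = 29
  static final int ENTRY = 12 + 4 + 4 + 1 + 1 + 8 + 4 + 1 + 20 + 8; // BucketEntry = 63
  static final int MAP_ENTRIES = 40 + 40; // CHM Map.Entry + CSLS entry
  static final int PER_BLOCK = KEY + ENTRY + MAP_ENTRIES;            // 172

  // 172 bytes x 10 million blocks = 1.72 GB, i.e. ~1.6 GiB on heap.
  static double gibFor(long blocks) {
    return PER_BLOCK * (double) blocks / (1L << 30);
  }
}
```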
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)