[ 
https://issues.apache.org/jira/browse/HBASE-17819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088838#comment-16088838
 ] 

Anoop Sam John commented on HBASE-17819:
----------------------------------------

These are the things I am trying out:
1. We have 2 Enum refs in the key and BucketEntry. Changing those to byte type 
and storing just the ordinal. We have only a few items in each Enum, so a byte 
is enough.
Result : Saving 6 bytes per entry
2. Changing BucketEntry so that we have 2 classes of BucketEntry per IOEngine 
type, like File mode and RAM-backed IOEngine. Only in RAM-backed mode do we 
need the ref count way. In file mode, we will remove this state and markedForEvict.
Result : Saving 21 bytes per entry for File mode
3. Changing the refCount type from AtomicInteger to a volatile int. The 
AtomicInteger object and its ref in BucketEntry take 20 bytes, whereas an 
int can work with 4 bytes. On the atomic increment/decrement, we will mimic 
what AtomicInteger is doing (using Unsafe CAS).
Result : Saving 16 bytes per entry for RAM-backed IOEngine
4. Removing the CSLM for tracking per-HFile blocks. So to remove blocks 
when an HFile is closed, we will have to iterate over all bucket entries, 
check each one's HFile, and then remove it. This is what we do in the LRU cache. 
Considering this operation does not happen in a hot path, it should be ok. We do 
this when CompactedHFilesDischarger runs (at a 2-minute interval) and removes all 
compacted-away files.
Result : Saving 40 bytes per entry
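Item 1 above can be sketched as follows. This is not the actual HBase code; the enum and class names are stand-ins, but it shows the idea of holding a 1-byte ordinal inline instead of a 4-byte reference to an enum constant:

```java
// Sketch (hypothetical names): store an enum as its byte ordinal instead of
// an object reference, assuming the enum stays small (fits in a byte).
public class EnumOrdinalSketch {
    // Stand-in for an enum like BlockPriority; the real one lives in HBase.
    enum Priority { SINGLE, MULTI, MEMORY }

    // Instead of:  Priority priority;       (4-byte compressed ref per entry)
    // store:       byte priorityOrdinal;    (1 byte inline in the entry)
    static byte toOrdinal(Priority p) {
        return (byte) p.ordinal();
    }

    static Priority fromOrdinal(byte b) {
        return Priority.values()[b];
    }

    public static void main(String[] args) {
        byte b = toOrdinal(Priority.MULTI);
        System.out.println(fromOrdinal(b)); // prints MULTI
        System.out.println((int) b);        // prints 1
    }
}
```

With two such enum refs per entry (key + BucketEntry), 2 x (4 - 1) = 6 bytes are saved, matching the figure above.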
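Item 3 can be sketched like this. The comment mentions Unsafe CAS; this sketch uses the public AtomicIntegerFieldUpdater API instead, which gives the same memory layout (a bare volatile int in the entry, no separate AtomicInteger object) with field-level CAS. Class names are hypothetical:

```java
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;

// Sketch (hypothetical names): replace an AtomicInteger refCount field with a
// plain volatile int plus field-level CAS, dropping the separate AtomicInteger
// object (16 bytes) and its reference from every entry.
public class RefCountSketch {
    static class Entry {
        // 4 bytes inline instead of a ref to a 16-byte AtomicInteger object.
        volatile int refCount;
    }

    private static final AtomicIntegerFieldUpdater<Entry> REF_COUNT =
            AtomicIntegerFieldUpdater.newUpdater(Entry.class, "refCount");

    // Mimics AtomicInteger.incrementAndGet / decrementAndGet via CAS.
    static int retain(Entry e) {
        return REF_COUNT.incrementAndGet(e);
    }

    static int release(Entry e) {
        return REF_COUNT.decrementAndGet(e);
    }

    public static void main(String[] args) {
        Entry e = new Entry();
        retain(e);
        retain(e);
        release(e);
        System.out.println(e.refCount); // prints 1
    }
}
```

The updater performs the same CAS loop an AtomicInteger would, so concurrency semantics are unchanged; only the per-entry footprint shrinks.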
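Item 4, the full-scan eviction that replaces the per-HFile CSLM, could look roughly like this. All names here are simplified stand-ins for the real BucketCache types:

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch (hypothetical names): without a per-HFile index, evicting a closed
// HFile's blocks means scanning the whole backing map and matching on the
// hfile name stored in each key.
public class FileEvictSketch {
    record Key(String hfileName, long offset) {}

    static int evictBlocksByHfileName(ConcurrentHashMap<Key, String> backingMap,
                                      String hfileName) {
        int evicted = 0;
        // O(n) over all cached blocks; acceptable off the hot path, e.g. on
        // the periodic compacted-files discharge described above.
        for (Key key : backingMap.keySet()) {
            if (key.hfileName().equals(hfileName)) {
                if (backingMap.remove(key) != null) {
                    evicted++;
                }
            }
        }
        return evicted;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<Key, String> map = new ConcurrentHashMap<>();
        map.put(new Key("hfile-a", 0L), "blockA0");
        map.put(new Key("hfile-a", 64L), "blockA1");
        map.put(new Key("hfile-b", 0L), "blockB0");
        System.out.println(evictBlocksByHfileName(map, "hfile-a")); // prints 2
        System.out.println(map.size());                             // prints 1
    }
}
```

The trade is clear: 40 bytes saved per entry, in exchange for a linear scan on file close rather than an O(log n) range lookup.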

> Reduce the heap overhead for BucketCache
> ----------------------------------------
>
>                 Key: HBASE-17819
>                 URL: https://issues.apache.org/jira/browse/HBASE-17819
>             Project: HBase
>          Issue Type: Sub-task
>          Components: BucketCache
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>             Fix For: 2.0.0
>
>
> We keep the Bucket entry map in BucketCache.  Below is the math of heap size for 
> the key and value in this map.
> BlockCacheKey
> ---------------
> String hfileName  -  Ref  - 4
> long offset  - 8
> BlockType blockType  - Ref  - 4
> boolean isPrimaryReplicaBlock  - 1
> Total  =  12 (Object) + 17 = 29
> BucketEntry
> ------------
> int offsetBase  -  4
> int length  - 4
> byte offset1  -  1
> byte deserialiserIndex  -  1
> long accessCounter  -  8
> BlockPriority priority  - Ref  - 4
> volatile boolean markedForEvict  -  1
> AtomicInteger refCount  -  16 + 4
> long cachedTime  -  8
> Total = 12 (Object) + 51 = 63
> ConcurrentHashMap Map.Entry  -  40
> blocksByHFile ConcurrentSkipListSet Entry  -  40
> Total = 29 + 63 + 80 = 172
> For 10 million blocks we will end up having 1.6GB of heap size.  
> This jira aims to reduce this as much as possible



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
