[
https://issues.apache.org/jira/browse/HBASE-17819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Anoop Sam John updated HBASE-17819:
-----------------------------------
Attachment: HBASE-17819_V3.patch
To explain the approach: this is a bit different from the V2 patch. The major
changes are
1. BucketEntry is extended to create the SharedMemory BucketEntry. For file
mode there is no need to keep the ref count, as that is not a shared memory
type, so I removed the new states added for HBASE-11425 from BucketEntry. The
off heap mode BucketEntry is now an extension which carries those new states.
2. Removed the CSLM that kept the blocks info per HFile name. This means
evictBlocksByHfileName will have a perf impact, as it now has to iterate
through all the entries to check whether each block entry belongs to this file
or not. To handle that, I changed evictBlocksByHfileName to be an async op: a
dedicated eviction thread does this work. Anyway, even if we don't remove
these blocks, or there is a delay in the removal, these blocks will eventually
get removed because we have an LRU algo for eviction. So when there is no
space left for adding new blocks, eviction will happen and remove unused
blocks. Moreover, eviction of blocks on HFile close is off by default (we have
a config to turn it on). For compaction, we still do evictByHFiles for the
compacted files now; there will just be a bit more delay before the actual
removal of the blocks.
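Point 1 above can be sketched roughly as below. This is only an illustration of the split described in this comment, assuming hypothetical class and field names (the field list follows the math quoted in the issue description); it is not the actual patch code.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Base entry, enough for file mode: no ref count / evict state needed,
// since file-mode blocks are not shared memory.
class BucketEntry {
    int offsetBase;
    int length;
    byte offset1;
    byte deserialiserIndex;
    long accessCounter;
    long cachedTime;
}

// Off heap (shared memory) variant: only this extension carries the
// HBASE-11425 states, so file-mode entries no longer pay for them.
class SharedMemoryBucketEntry extends BucketEntry {
    volatile boolean markedForEvict;
    final AtomicInteger refCount = new AtomicInteger(0);
}
```

With this split, the AtomicInteger (16 + 4 bytes in the quoted math) and the markedForEvict flag are paid only by off-heap entries.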
But we save a lot of heap memory per entry with this approach. The math is in
the above comment.
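Point 2, the async evictBlocksByHfileName, could look roughly like the following. The queue-based handoff, the string-keyed stand-in map, and all names here are assumptions for illustration only, not the patch itself; the real cache keys the backing map by BlockCacheKey.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

class FileEvictionSketch {
    // Stand-in for the backing map; keys encoded as "hfileName,offset".
    final Map<String, Integer> backingMap = new ConcurrentHashMap<>();
    final BlockingQueue<String> evictionQueue = new LinkedBlockingQueue<>();

    // Caller side is now cheap: just hand the file name to the eviction thread.
    void evictBlocksByHfileName(String hfileName) {
        evictionQueue.add(hfileName);
    }

    // Dedicated eviction thread: the full-map scan (the perf cost noted
    // above) happens here, off the caller's path.
    void drainOnce() throws InterruptedException {
        String hfileName = evictionQueue.take();
        backingMap.keySet().removeIf(k -> k.startsWith(hfileName + ","));
    }
}
```

Since the LRU eviction will reclaim unused blocks anyway, a delay in this background removal only costs temporary cache occupancy, not correctness.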
> Reduce the heap overhead for BucketCache
> ----------------------------------------
>
> Key: HBASE-17819
> URL: https://issues.apache.org/jira/browse/HBASE-17819
> Project: HBase
> Issue Type: Sub-task
> Components: BucketCache
> Reporter: Anoop Sam John
> Assignee: Anoop Sam John
> Priority: Major
> Fix For: 2.0.0
>
> Attachments: HBASE-17819_V1.patch, HBASE-17819_V2.patch,
> HBASE-17819_V3.patch
>
>
> We keep the Bucket entry map in BucketCache. Below is the math for the
> heapSize of the key and value in this map.
> BlockCacheKey
> ---------------
> String hfileName - Ref - 4
> long offset - 8
> BlockType blockType - Ref - 4
> boolean isPrimaryReplicaBlock - 1
> Total = 12 (Object) + 17 = 29
> BucketEntry
> ------------
> int offsetBase - 4
> int length - 4
> byte offset1 - 1
> byte deserialiserIndex - 1
> long accessCounter - 8
> BlockPriority priority - Ref - 4
> volatile boolean markedForEvict - 1
> AtomicInteger refCount - 16 + 4
> long cachedTime - 8
> Total = 12 (Object) + 51 = 63
> ConcurrentHashMap Map.Entry - 40
> blocksByHFile ConcurrentSkipListSet Entry - 40
> Total = 29 + 63 + 80 = 172
> For 10 million blocks we will end up having 1.6GB of heap size.
> This jira aims to reduce this as much as possible.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)