[
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ramkrishna.s.vasudevan resolved HBASE-16630.
--------------------------------------------
Resolution: Fixed
Pushed to all branches including branch-1.2. There was an env issue in my
branch-1.2. I was able to correct that and committed this patch.
Thanks for all the reviews and for your persistence with this patch [~dvdreddy].
> Fragmentation in long running Bucket Cache
> ------------------------------------------
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
> Issue Type: Bug
> Components: BucketCache
> Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
> Reporter: deepankar
> Assignee: deepankar
> Priority: Critical
> Fix For: 2.0.0, 1.4.0, 1.3.1, 1.2.6
>
> Attachments: 16630-v2-suggest.patch, 16630-v3-suggest.patch,
> HBASE-16630.patch, HBASE-16630-v2.patch, HBASE-16630-v3-branch-1.patch,
> HBASE-16630-v3-branch-1.X.patch, HBASE-16630-v3.patch,
> HBASE-16630-v4-branch-1.X.patch
>
>
> We have been running the bucket cache for a long time in our system, and we
> observe that some nodes, after some time, do not fully utilize the bucket
> cache. In some cases it is even worse: they get stuck at a fraction < 0.25
> of the bucket cache (DEFAULT_MEMORY_FACTOR, as all our tables are configured
> in-memory for simplicity's sake).
> We took a heap dump, analyzed what was happening, and saw that it is a
> classic case of fragmentation. The current implementation of BucketCache
> (mainly BucketAllocator) relies on fullyFreeBuckets being available for
> switching/adjusting cache usage between different bucketSizes. But once a
> compaction / bulkload happens and blocks are evicted from a bucketSize,
> they are usually evicted from random places in the buckets of that
> bucketSize, which locks the number of buckets associated with the
> bucketSize. In the worst case of fragmentation we have seen some
> bucketSizes with an occupancy ratio of < 10 % that still don't have any
> completelyFreeBuckets to share with the other bucketSizes.
> Currently the existing eviction logic helps in the cases where the cache
> used is more than the MEMORY_FACTOR or MULTI_FACTOR. Once those evictions
> are done, the eviction (freeSpace function) will not evict anything, and
> the cache utilization will be stuck at that value without any allocations
> for other required sizes.
> The fix we came up with is simple: we do defragmentation (compaction) of
> the bucketSize, thus increasing the occupancy ratio and also freeing up
> buckets to become fullyFree. The logic itself is not complicated, as the
> bucketAllocator takes care of packing the blocks in the buckets; we only
> need to evict and re-allocate the blocks for all the bucketSizes that
> don't fit the criteria.
> I am attaching an initial patch just to give an idea of what we are thinking
> and I'll improve it based on the comments from the community.
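
To illustrate the idea described in the issue, here is a minimal toy sketch in
Java. It is not the committed patch and does not use any HBase classes: the
names BucketDefragSketch, Bucket, occupancyRatio, fullyFreeCount, defragment
and fragmentedExample are all hypothetical, and a bucket is modelled simply as
an array of equally sized slots. It shows how a size class can have low
occupancy yet no fully free bucket, and how evicting and re-allocating the
surviving blocks packs them densely and frees whole buckets.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a single bucket size class; an illustration, not HBase code. */
public class BucketDefragSketch {

    /** A bucket with a fixed number of equally sized slots; null marks a free slot. */
    static final class Bucket {
        final String[] slots;

        Bucket(int slotCount) {
            slots = new String[slotCount];
        }

        int used() {
            int n = 0;
            for (String s : slots) {
                if (s != null) n++;
            }
            return n;
        }
    }

    /** Fraction of slots in use across all buckets of this size class. */
    static double occupancyRatio(List<Bucket> buckets) {
        int used = 0;
        int total = 0;
        for (Bucket b : buckets) {
            used += b.used();
            total += b.slots.length;
        }
        return total == 0 ? 0.0 : (double) used / total;
    }

    /** Number of buckets with no live block, i.e. buckets another size class could reuse. */
    static int fullyFreeCount(List<Bucket> buckets) {
        int n = 0;
        for (Bucket b : buckets) {
            if (b.used() == 0) n++;
        }
        return n;
    }

    /** Evict every live block, then re-allocate the blocks densely from the first bucket on. */
    static void defragment(List<Bucket> buckets) {
        List<String> live = new ArrayList<>();
        for (Bucket b : buckets) {
            for (int i = 0; i < b.slots.length; i++) {
                if (b.slots[i] != null) {
                    live.add(b.slots[i]);
                    b.slots[i] = null; // evict
                }
            }
        }
        int next = 0;
        for (Bucket b : buckets) {
            for (int i = 0; i < b.slots.length && next < live.size(); i++) {
                b.slots[i] = live.get(next++); // re-allocate into the first free slot
            }
        }
    }

    /** Four buckets of eight slots, each left with one block at a scattered position. */
    static List<Bucket> fragmentedExample() {
        List<Bucket> buckets = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            Bucket b = new Bucket(8);
            b.slots[(i * 3) % 8] = "block-" + i;
            buckets.add(b);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<Bucket> buckets = fragmentedExample();
        System.out.println("occupancy ratio:   " + occupancyRatio(buckets));
        System.out.println("fully free before: " + fullyFreeCount(buckets));
        defragment(buckets);
        System.out.println("fully free after:  " + fullyFreeCount(buckets));
    }
}
```

In this toy model the overall ratio of used slots in the size class stays the
same, but the blocks end up concentrated in fewer buckets, so the remaining
buckets become fully free. A real implementation would also have to handle
concurrent readers and decide which size classes qualify for compaction, which
this sketch leaves out.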