deepankar updated HBASE-16630:
    Attachment: HBASE-16630-v3.patch

Sorry for the delay in adding the suggestions, [~tedyu]. I have now attached a 
patch which, in addition to your suggestions, contains a couple of import 
fixes. I also tested the patch on a couple of machines and everything looked 
fine. We are doing a cluster-wide deploy today and will report the results.

> Fragmentation in long running Bucket Cache
> ------------------------------------------
>                 Key: HBASE-16630
>                 URL: https://issues.apache.org/jira/browse/HBASE-16630
>             Project: HBase
>          Issue Type: Bug
>          Components: BucketCache
>    Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>            Reporter: deepankar
>            Assignee: deepankar
>         Attachments: 16630-v2-suggest.patch, HBASE-16630-v2.patch, 
> HBASE-16630-v3.patch, HBASE-16630.patch
> We have been running the bucket cache for a long time in our system, and we 
> are observing cases where some nodes, after some time, do not fully utilize 
> the bucket cache. In some cases it is even worse: they get stuck at a value 
> below 0.25 (25%) of the bucket cache (DEFAULT_MEMORY_FACTOR, as all our 
> tables are configured in-memory for simplicity's sake).
> We took a heap dump, analyzed what was happening, and saw that it is a 
> classic case of fragmentation. The current implementation of BucketCache 
> (mainly BucketAllocator) relies on fullyFreeBuckets being available for 
> switching/adjusting cache usage between different bucketSizes. But once a 
> compaction or bulk load happens and blocks are evicted from a bucketSize, 
> they are usually evicted from random places in the buckets of that 
> bucketSize, which locks the number of buckets associated with it. In the 
> worst case of fragmentation, we have seen some bucketSizes with an occupancy 
> ratio of < 10% that still have no completelyFreeBuckets to share with the 
> other bucketSizes.
> Currently the existing eviction logic helps in cases where the cache used is 
> more than the MEMORY_FACTOR or MULTI_FACTOR. Once those evictions are done, 
> the eviction (freeSpace function) will not evict anything further, and the 
> cache utilization stays stuck at that value without any allocations for the 
> other required sizes.
> The fix we came up with is simple: we defragment (compact) the bucketSize, 
> which increases the occupancy ratio and frees up buckets to become fullyFree. 
> The logic itself is not complicated, as the bucketAllocator takes care of 
> packing the blocks into the buckets; we only need to evict and re-allocate 
> the blocks for all the bucketSizes that do not fit the criteria.
> I am attaching an initial patch just to give an idea of what we are thinking 
> and I'll improve it based on the comments from the community.
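
The fragmentation and defragmentation idea in the description above can be 
illustrated with a toy model. This is only a sketch under stated assumptions, 
not the code in the attached patches and not the real BucketAllocator API: the 
class BucketDefragSketch, the Bucket type, and every method name below are 
hypothetical, and a bucket is modelled simply as a fixed number of equal-sized 
slots for a single bucketSize.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model only: all names here are hypothetical, not HBase classes.
public class BucketDefragSketch {

    /** One bucket of a single bucketSize; each slot holds one cached block. */
    static final class Bucket {
        final int slots;
        final BitSet used;

        Bucket(int slots) {
            this.slots = slots;
            this.used = new BitSet(slots);
        }

        int usedCount() { return used.cardinality(); }

        boolean isFullyFree() { return used.isEmpty(); }
    }

    /** Fraction of all slots, across the given buckets, that hold a block. */
    static double occupancy(List<Bucket> buckets) {
        int usedSlots = 0;
        int totalSlots = 0;
        for (Bucket b : buckets) {
            usedSlots += b.usedCount();
            totalSlots += b.slots;
        }
        return totalSlots == 0 ? 0.0 : (double) usedSlots / totalSlots;
    }

    /** Buckets that could be handed over to another bucketSize. */
    static int fullyFreeCount(List<Bucket> buckets) {
        int count = 0;
        for (Bucket b : buckets) {
            if (b.isFullyFree()) {
                count++;
            }
        }
        return count;
    }

    /** Evict every live block, then re-allocate them packed from the front. */
    static void defragment(List<Bucket> buckets) {
        int live = 0;
        for (Bucket b : buckets) {
            live += b.usedCount();
            b.used.clear();
        }
        for (Bucket b : buckets) {
            int take = Math.min(live, b.slots);
            b.used.set(0, take); // no-op when take == 0
            live -= take;
        }
    }

    /** Ten 16-slot buckets, each left with one block after random evictions. */
    static List<Bucket> fragmentedExample() {
        List<Bucket> buckets = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            Bucket b = new Bucket(16);
            b.used.set(i); // a single surviving block at a different offset
            buckets.add(b);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<Bucket> buckets = fragmentedExample();
        System.out.printf("before: occupancy=%.4f fullyFree=%d%n",
                occupancy(buckets), fullyFreeCount(buckets));
        defragment(buckets);
        System.out.printf("after:  occupancy=%.4f fullyFree=%d%n",
                occupancy(buckets), fullyFreeCount(buckets));
    }
}
```

In this model the ten buckets start at 10/160 occupancy with no fully free 
bucket, which mirrors the "< 10% but nothing to share" situation. After 
repacking, the same ten blocks sit in one bucket and the other nine are fully 
free. A real implementation would also have to move the cached data and update 
the block-to-offset mapping, which the sketch leaves out.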

This message was sent by Atlassian JIRA
