deepankar created HBASE-16630:
---------------------------------

             Summary: Fragmentation in long running Bucket Cache
                 Key: HBASE-16630
                 URL: https://issues.apache.org/jira/browse/HBASE-16630
             Project: HBase
          Issue Type: Bug
          Components: BucketCache
    Affects Versions: 1.2.3, 1.1.6, 2.0.0, 1.3.1
            Reporter: deepankar
            Assignee: deepankar


We have been running the bucket cache for a long time in our system, and we are 
observing cases where some nodes, after some time, no longer fully utilize the 
bucket cache. In some cases it is even worse: usage gets stuck at a value below 
0.25 of the bucket cache (DEFAULT_MEMORY_FACTOR, as all our tables are 
configured in-memory for simplicity's sake).

We took a heap dump, analyzed what was happening, and saw that it is a classic 
case of fragmentation. The current implementation of BucketCache (mainly 
BucketAllocator) relies on fullyFreeBuckets being available for 
switching/adjusting cache usage between different bucketSizes. But once a 
compaction or bulk load happens and blocks are evicted from a bucketSize, they 
are usually evicted from random places in the buckets of that bucketSize, which 
locks in the number of buckets associated with it. In the worst case of 
fragmentation, we have seen some bucketSizes with an occupancy ratio of < 10% 
that still have no completelyFreeBuckets to share with the other bucketSizes.
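
To make the fragmentation case concrete, here is a minimal sketch of the 
arithmetic. It is a hypothetical model, not HBase code: the class and method 
names are made up for illustration, and each bucket is reduced to a count of 
used slots.

```java
// Hypothetical model for illustration only; not the BucketAllocator code.
// Each array element is the number of used slots in one bucket of a bucketSize.
final class FragmentationSketch {

    /** Fraction of slots in use across all buckets of one bucketSize. */
    static double occupancyRatio(int[] usedSlotsPerBucket, int slotsPerBucket) {
        if (usedSlotsPerBucket.length == 0) {
            return 0.0;
        }
        long used = 0;
        for (int u : usedSlotsPerBucket) {
            used += u;
        }
        long total = (long) usedSlotsPerBucket.length * slotsPerBucket;
        return (double) used / total;
    }

    /** Buckets with no used slots, i.e. the only ones another size could take. */
    static int completelyFreeBuckets(int[] usedSlotsPerBucket) {
        int free = 0;
        for (int u : usedSlotsPerBucket) {
            if (u == 0) {
                free++;
            }
        }
        return free;
    }

    /** Buckets that would be needed if the same blocks were packed densely. */
    static int bucketsNeededIfPacked(int[] usedSlotsPerBucket, int slotsPerBucket) {
        long used = 0;
        for (int u : usedSlotsPerBucket) {
            used += u;
        }
        return (int) ((used + slotsPerBucket - 1) / slotsPerBucket);
    }
}
```

With ten buckets of 16 slots that each hold a single surviving block, the 
occupancy ratio is 10/160 = 6.25%, no bucket is completely free, and dense 
packing would need only one bucket.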

The existing eviction logic helps in the cases where the cache used is more 
than MEMORY_FACTOR or MULTI_FACTOR. Once those evictions are done, the 
eviction (freeSpace function) will not evict anything further, and the cache 
utilization stays stuck at that value without any allocations for the other 
required sizes.
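
The stuck state can be illustrated with a deliberately simplified overflow 
check. This is an assumption about the shape of the logic, not the actual 
freeSpace implementation; the class, method, and parameter names are 
hypothetical.

```java
// Hypothetical simplification; the real freeSpace function is more involved.
final class EvictionThresholdSketch {

    /**
     * Bytes a priority group holds beyond its share of the cache.
     * Returns 0 when the group is within its share, so nothing is evicted.
     */
    static long overflowBytes(long groupUsedBytes, long cacheCapacityBytes,
                              double groupFactor) {
        long share = (long) Math.floor(cacheCapacityBytes * groupFactor);
        return Math.max(0L, groupUsedBytes - share);
    }
}
```

Under this model, a group using 200 bytes of a 1000-byte cache with a factor 
of 0.25 has an overflow of 0, so eviction frees nothing even if allocations 
for another bucketSize keep failing.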

The fix we came up with is simple: defragment (compact) the bucketSize, 
thereby increasing its occupancy ratio and also freeing up buckets to become 
fullyFree. The logic itself is not complicated, as the BucketAllocator takes 
care of packing the blocks into the buckets; we just need to evict and 
re-allocate the blocks for all the bucketSizes that don't fit the criteria.
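
The evict-and-re-allocate idea can be sketched roughly as follows. This is not 
the attached patch: Bucket, defragment, and the minOccupancy threshold are 
hypothetical names, and a simple first-free-slot fill stands in for the 
packing that the BucketAllocator would do.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of defragmenting one bucketSize; not the HBase patch.
final class DefragSketch {

    /** One bucket: fixed slots, each holding a block id or null when free. */
    static final class Bucket {
        final String[] slots;

        Bucket(int slotCount) {
            slots = new String[slotCount];
        }

        boolean isCompletelyFree() {
            for (String s : slots) {
                if (s != null) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Fraction of slots in use across the given buckets. */
    static double occupancyRatio(List<Bucket> buckets) {
        long used = 0;
        long total = 0;
        for (Bucket b : buckets) {
            for (String s : b.slots) {
                total++;
                if (s != null) {
                    used++;
                }
            }
        }
        return total == 0 ? 0.0 : (double) used / total;
    }

    /**
     * If occupancy is below minOccupancy, evicts every block and re-allocates
     * it into the first free slot, so live blocks end up densely packed.
     * Returns the number of completely free buckets afterwards.
     */
    static int defragment(List<Bucket> buckets, double minOccupancy) {
        if (occupancyRatio(buckets) < minOccupancy) {
            List<String> live = new ArrayList<>();
            for (Bucket b : buckets) {
                for (int i = 0; i < b.slots.length; i++) {
                    if (b.slots[i] != null) {
                        live.add(b.slots[i]); // evict
                        b.slots[i] = null;
                    }
                }
            }
            int next = 0;
            for (Bucket b : buckets) {
                for (int i = 0; i < b.slots.length && next < live.size(); i++) {
                    b.slots[i] = live.get(next++); // re-allocate, packed
                }
            }
        }
        int free = 0;
        for (Bucket b : buckets) {
            if (b.isCompletelyFree()) {
                free++;
            }
        }
        return free;
    }
}
```

For four 4-slot buckets holding one block each, defragment with a threshold of 
0.5 packs all four blocks into the first bucket and leaves three buckets 
completely free for other bucketSizes.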

I am attaching an initial patch just to give an idea of what we are thinking 
and I'll improve it based on the comments from the community.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
