Janardhan Hungund created HBASE-28463:
-----------------------------------------
Summary: Time Based Priority for BucketCache
Key: HBASE-28463
URL: https://issues.apache.org/jira/browse/HBASE-28463
Project: HBase
Issue Type: New Feature
Components: BucketCache
Reporter: Janardhan Hungund
This Jira introduces time-based data tiering in HBase to optimize storage
efficiency and access performance by segregating data based on its recency.
Recent data is kept in the bucket cache (backed by faster storage types such as
SSDs), while older data is evicted. The goal is more flexible control over the
cache allocation and eviction logic via configuration, allowing time-based
priorities to be defined for cached data.
The need for a more flexible cache allocation mechanism is even more critical
on HBase deployments where cache hits translate into significant performance
gains, such as when cloud storage is used as the underlying file system.
The data is segregated into hot and cold categories based on its age. Recent
data within a specific time range (configured as hot-data-age) is treated as
hot and is stored in the ephemeral cache, while older data is stored in and
accessed from the cloud storage.
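As a rough illustration of the hot/cold rule described above, here is a minimal
sketch in Java. It is not taken from the design document or from HBase: the
class name HotDataClassifier, the method isHot, the 7-day value, and the choice
to treat data exactly at the hot-data-age boundary as hot are all assumptions
made for the example.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper for illustration only; not an HBase API.
public class HotDataClassifier {

    // Returns true when the data's age is within the configured hot-data-age.
    // Assumption: the boundary is inclusive, and data with a timestamp in the
    // future (negative age) is also treated as hot.
    static boolean isHot(Instant dataTimestamp, Instant now, Duration hotDataAge) {
        Duration age = Duration.between(dataTimestamp, now);
        return age.compareTo(hotDataAge) <= 0;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-03-28T00:00:00Z");
        Duration hotDataAge = Duration.ofDays(7); // assumed example value

        // Three days old: within hot-data-age, so a candidate for the bucket cache.
        System.out.println(isHot(Instant.parse("2024-03-25T00:00:00Z"), now, hotDataAge));
        // Twenty-seven days old: cold, so it would be read from cloud storage.
        System.out.println(isHot(Instant.parse("2024-03-01T00:00:00Z"), now, hotDataAge));
    }
}
```

How the age of a block or file is actually derived (for example from cell
timestamps or file metadata) is left to the design document.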
This feature is intended to deliver TCO gains by optimizing the utilization of
the high-cost bucket cache. It is a good fit for use cases where data is written
by date and scans focus on the recently written data.
Please find the detailed design document for the feature attached to this Jira.
Thanks,
Janardhan
--
This message was sent by Atlassian Jira
(v8.20.10#820010)