[ https://issues.apache.org/jira/browse/HBASE-28463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wellington Chevreuil resolved HBASE-28463.
------------------------------------------
    Resolution: Fixed

This feature is now merged into master, branch-3, branch-2 and branch-2.6.

Thanks for the contributions, [~janardhan.hungund] and [~vinayakhegde], and for the reviews, [~taklwu]!

> Time Based Priority for BucketCache
> -----------------------------------
>
>                 Key: HBASE-28463
>                 URL: https://issues.apache.org/jira/browse/HBASE-28463
>             Project: HBase
>          Issue Type: New Feature
>          Components: BucketCache
>            Reporter: Janardhan Hungund
>            Assignee: Wellington Chevreuil
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2, 2.6.4
>
>
> This Jira introduces the feature of time-based priority in BucketCache, where
> a configurable "age" is used as a threshold limit for data caching. Data
> blocks younger than this limit should be kept in the cache, while older data
> would be picked for eviction (or not considered for caching). The data age
> based priority would be applied when deciding if a block should be added to
> BucketCache (i.e. during reads, writes, compaction and prefetch), as well as
> during the cache freeSpace run (mass eviction), before applying the LRU logic.
>
> Because blocks don't hold any specific meta information other than type, it's
> necessary to group blocks of the same "age group" into separate files. We
> already have DateTieredCompaction for that, which groups blocks into different
> time windows according to their cells' timestamp values.
> DateTieredCompaction can be configured to provide two windows (one older and
> one younger than the threshold limit), so that a cell timestamp based age
> priority can be implemented. Additionally, we have extended
> DateTieredCompaction so that the "age" value to be used for comparison can be
> provided in a pluggable way, giving extra flexibility for different use cases
> to implement their own concept of time priority.
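
The age-threshold rule described above can be illustrated with a minimal sketch. This is not HBase code: `StoreFileInfo`, `is_hot`, `eviction_order` and the parameter names are hypothetical, and the sketch assumes that a file's age is measured from its newest cell timestamp and that the input list is already in LRU order (least recently used first).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StoreFileInfo:
    """Hypothetical stand-in for a store file; not an HBase class."""
    name: str
    max_timestamp_ms: int  # newest cell timestamp in the file (assumed age source)


def is_hot(f: StoreFileInfo, hot_age_ms: int, now_ms: int) -> bool:
    """A file is 'hot' when its newest data is younger than the age threshold."""
    return now_ms - f.max_timestamp_ms < hot_age_ms


def eviction_order(files: List[StoreFileInfo], hot_age_ms: int, now_ms: int) -> List[StoreFileInfo]:
    """Order eviction candidates: cold files first (oldest data first),
    then hot files in their incoming order (assumed to be LRU order)."""
    cold = sorted(
        (f for f in files if not is_hot(f, hot_age_ms, now_ms)),
        key=lambda f: f.max_timestamp_ms,
    )
    hot = [f for f in files if is_hot(f, hot_age_ms, now_ms)]
    return cold + hot


if __name__ == "__main__":
    now = 1_000_000
    files = [
        StoreFileInfo("a", 950_000),  # age 50,000 ms: hot
        StoreFileInfo("b", 800_000),  # age 200,000 ms: cold
        StoreFileInfo("c", 900_000),  # age 100,000 ms: cold (not younger than the limit)
    ]
    print([f.name for f in eviction_order(files, hot_age_ms=100_000, now_ms=now)])
```

The same `is_hot` check could also serve as the admission test (whether a block is considered for caching at all); the sketch only shows the ordering applied before any LRU logic.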
>
> The current scope is to allow for data age to be determined in the following
> ways, all configurable:
> * Cell timestamps: Uses the timestamp portion of HBase cells for comparing
> the data age. Requires DateTieredCompaction to be configured to provide two
> time windows, one older and one younger than the time limit threshold.
> * Custom cell qualifiers: Uses a custom-defined qualifier for comparing the
> data age. It uses that value to tier the entire row containing the given
> qualifier value. This requires that the custom qualifier value be a valid
> Java long timestamp, and that the "new" compaction implementation defined as
> part of this feature, the CustomTieredCompaction, be used.
> * Custom value provider: Allows for defining a pluggable implementation that
> contains the logic for identifying the date value to be used for comparison.
> This also requires the "new" compaction implementation defined as part of
> this feature, the CustomTieredCompaction.
>
> The initial scope, proposed in 2024, covered the cell timestamp strategy
> mentioned above and is detailed in this [design
> doc.|https://docs.google.com/document/d/1Qd3kvZodBDxHTFCIRtoePgMbvyuUSxeydi2SEWQFQro/edit?tab=t.0#heading=h.gjdgxs]
> The second phase, which includes the two custom strategies mentioned above,
> is detailed in [this separate design
> doc.|https://docs.google.com/document/d/1uBGIO9IQ-FbSrE5dnUMRtQS23NbCbAmRVDkAOADcU_E/edit?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)