Wellington Chevreuil created HBASE-29707:
--------------------------------------------
Summary: Fix region cache % metrics miss calculation
Key: HBASE-29707
URL: https://issues.apache.org/jira/browse/HBASE-29707
Project: HBase
Issue Type: Bug
Components: BucketCache
Reporter: Wellington Chevreuil
Assignee: Wellington Chevreuil
HBASE-28246 introduced a metric that tracks the percentage of each region's
data that is cached and displays it on the RS UI store file metrics tab.
Unfortunately, in the following scenarios this metric can be miscalculated
and display wrong information:
1) Region compactions: During compactions with cacheCompactedOnWrite set to
true, we cache the new blocks written during compaction, which correctly
updates the related metric. However, once the compaction finishes and the
compacted files have their readers closed, we evict their blocks but fail to
subtract them from the metric.
2) Caching of HLinks pointing to files in the archive: When caching an HLink
for a file in the archive folder from a region that's still online, we add
that file's cached size to the original region. This is wrong, since the
archived file's blocks are not relevant to the original region, only to the
region containing the link.
In both cases, affected regions show a “% Cache” value above 100% on
the Web UI, which is misleading. This metric is also used by the
CacheAwareLoadBalancer, so overcounting a region's cache percentage can affect
this balancer's efficiency.
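
The accounting behind both scenarios can be illustrated with a minimal sketch.
This is not the BucketCache code: the class RegionCacheTrackerSketch, its
methods, and the region names are all hypothetical and only model the
bookkeeping described above. It assumes the metric is the cached byte count
divided by the region's store file size, that evictions subtract from the
per-region count, and that blocks read through a link are credited to the
region holding the link.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-region cached-bytes tracker; NOT HBase's actual implementation. */
final class RegionCacheTrackerSketch {
  // Region name -> total size in bytes of the region's current store files.
  private final Map<String, Long> regionTotalBytes = new HashMap<>();
  // Region name -> bytes of that region's data currently held in the cache.
  private final Map<String, Long> regionCachedBytes = new HashMap<>();

  public void setRegionSize(String region, long totalBytes) {
    regionTotalBytes.put(region, totalBytes);
  }

  /**
   * Credits cached bytes to the region that opened the reader. For a link,
   * the caller passes the region containing the link, not the region the
   * archived file originally belonged to (scenario 2).
   */
  public void onBlockCached(String owningRegion, long blockBytes) {
    regionCachedBytes.merge(owningRegion, blockBytes, Long::sum);
  }

  /** Subtracts evicted bytes, e.g. when a compacted file's reader is closed (scenario 1). */
  public void onBlockEvicted(String owningRegion, long blockBytes) {
    regionCachedBytes.computeIfPresent(
        owningRegion, (region, cached) -> Math.max(0L, cached - blockBytes));
  }

  public double percentCached(String region) {
    long total = regionTotalBytes.getOrDefault(region, 0L);
    if (total <= 0) {
      return 0.0;
    }
    return 100.0 * regionCachedBytes.getOrDefault(region, 0L) / total;
  }

  public static void main(String[] args) {
    RegionCacheTrackerSketch tracker = new RegionCacheTrackerSketch();

    // Scenario 1: a region with one fully cached 100-byte store file.
    tracker.setRegionSize("regionA", 100L);
    tracker.onBlockCached("regionA", 100L);
    // Compaction rewrites the file and caches the new blocks on write.
    tracker.onBlockCached("regionA", 100L);
    System.out.println("during compaction: " + tracker.percentCached("regionA"));
    // The compacted file's reader is closed and its blocks are evicted.
    // Skipping this call is what leaves the metric above 100%.
    tracker.onBlockEvicted("regionA", 100L);
    System.out.println("after eviction: " + tracker.percentCached("regionA"));

    // Scenario 2: regionB holds a link to an archived file that once
    // belonged to regionA. The cached bytes are credited to regionB only.
    tracker.setRegionSize("regionB", 50L);
    tracker.onBlockCached("regionB", 50L);
    System.out.println("regionA: " + tracker.percentCached("regionA"));
    System.out.println("regionB: " + tracker.percentCached("regionB"));
  }
}
```

In this sketch the choice of which region name to pass to onBlockCached is
left to the caller; how the real cache resolves the owning region of a linked
file is not covered here.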