Wellington Chevreuil created HBASE-30135:
--------------------------------------------
Summary: Improve CacheAwareLoadBalancer to simulate low cache
ratio regions as cached in candidate servers with enough cache space
Key: HBASE-30135
URL: https://issues.apache.org/jira/browse/HBASE-30135
Project: HBase
Issue Type: Sub-task
Reporter: Wellington Chevreuil
Assignee: Wellington Chevreuil
When time-based priority is enabled on CFs that are already cached, on clusters
with datasets larger than the cache capacity, hot data that was previously
left uncached for lack of cache space remains uncached even after the now-cold
data gets evicted, unless:
1) A client read request reads it from the file system and caches it;
2) An operator manually disables and re-enables tables with hot data, so that
the prefetch executor can run and cache those blocks.
Both options are suboptimal: they cause temporary performance impacts and/or
require manual intervention.
Currently, the CacheAwareLoadBalancer only raises the cost of moving highly
cached regions when calculating assignment plans. It doesn't consider cache
ratio at all when calculating potential imbalance, only skewness. So when
regions are evenly distributed and there's no skewness,
CacheAwareLoadBalancer does not trigger any moves, even if many servers host
regions with a low cache ratio and there is enough free cache space on region
servers to accommodate those regions' data.
This proposal is to include regions with a low cache ratio in the imbalance
calculation, so that CacheAwareLoadBalancer can trigger the computation of new
assignment plans.
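As a rough illustration of the imbalance check described above, here is a minimal sketch. It is not the actual CacheAwareLoadBalancer code: the Region and Server classes, the needsBalance method and the lowRatioThreshold parameter are all hypothetical names made up for this example.

```java
import java.util.List;

// Hypothetical sketch only; none of these types exist in HBase.
final class LowCacheRatioImbalance {

  /** Illustrative region view: data size and fraction of it currently cached. */
  static final class Region {
    final String name;
    final long sizeBytes;
    final double cacheRatio;

    Region(String name, long sizeBytes, double cacheRatio) {
      this.name = name;
      this.sizeBytes = sizeBytes;
      this.cacheRatio = cacheRatio;
    }
  }

  /** Illustrative server view: free cache space and the regions it hosts. */
  static final class Server {
    final String name;
    final long freeCacheBytes;
    final List<Region> regions;

    Server(String name, long freeCacheBytes, List<Region> regions) {
      this.name = name;
      this.freeCacheBytes = freeCacheBytes;
      this.regions = regions;
    }
  }

  /**
   * Returns true if at least one region with a cache ratio below the threshold
   * could fit entirely in the free cache space of some other server.
   */
  static boolean needsBalance(List<Server> servers, double lowRatioThreshold) {
    for (Server host : servers) {
      for (Region region : host.regions) {
        if (region.cacheRatio >= lowRatioThreshold) {
          continue; // already well cached where it is
        }
        for (Server candidate : servers) {
          if (candidate != host && candidate.freeCacheBytes >= region.sizeBytes) {
            return true;
          }
        }
      }
    }
    return false;
  }
}
```

In this sketch the check is independent of skewness, so an evenly distributed cluster can still report an imbalance when a poorly cached region would fit in another server's free cache.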
It also needs to recalculate the cache ratio for the assignment plans,
simulating that low cache ratio regions moved to servers with enough free cache
space would get fully cached, so that such a plan scores higher than the
current state.
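The simulation step could look roughly like the sketch below. It is an assumption-laden illustration, not the balancer's real cost function: the Region class, the simulatedCacheRatio method, the plan map (region name to target server) and the size-weighted average used as the score are all hypothetical. Regions that are not moved, or that do not fit in the target's free cache, simply keep their current ratio here, which is a simplification.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; none of these types exist in HBase.
final class CacheRatioPlanSimulator {

  /** Illustrative region view: data size and fraction of it currently cached. */
  static final class Region {
    final String name;
    final long sizeBytes;
    final double cacheRatio;

    Region(String name, long sizeBytes, double cacheRatio) {
      this.name = name;
      this.sizeBytes = sizeBytes;
      this.cacheRatio = cacheRatio;
    }
  }

  /**
   * Size-weighted average cache ratio after applying a plan, treating each moved
   * low cache ratio region as fully cached if the target server has room for it.
   */
  static double simulatedCacheRatio(List<Region> regions, Map<String, String> plan,
      Map<String, Long> freeCacheBytes, double lowRatioThreshold) {
    // Copy so the simulation never mutates the caller's view of free space.
    Map<String, Long> remaining = new HashMap<>(freeCacheBytes);
    double cachedBytes = 0.0;
    double totalBytes = 0.0;
    for (Region region : regions) {
      double ratio = region.cacheRatio;
      String target = plan.get(region.name);
      if (target != null && ratio < lowRatioThreshold) {
        long free = remaining.getOrDefault(target, 0L);
        if (free >= region.sizeBytes) {
          ratio = 1.0; // simulate the region being fully cached on the target
          remaining.put(target, free - region.sizeBytes);
        }
      }
      cachedBytes += ratio * region.sizeBytes;
      totalBytes += region.sizeBytes;
    }
    return totalBytes == 0.0 ? 0.0 : cachedBytes / totalBytes;
  }
}
```

Comparing the value for an empty plan (the current state) with the value for a candidate plan gives a simple way to see whether the candidate would score higher under this assumption.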