[
https://issues.apache.org/jira/browse/HBASE-30134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wellington Chevreuil updated HBASE-30134:
-----------------------------------------
Description:
When time-based priority is enabled on CFs that are already cached, on clusters
with datasets larger than the cache capacity, once the now-cold data gets
evicted, hot data that was previously uncached for lack of cache space
remains uncached unless:
1) A client read request reads it from the file system and caches it;
2) An operator manually disables and re-enables the tables with hot data, so
that the prefetch executor can run and cache those blocks.
Both options are suboptimal, leading to temporary performance impacts and/or
requiring manual intervention.
The CacheAwareLoadBalancer currently only raises the cost of moving highly
cached regions when calculating assignment plans. It doesn't consider cache
ratio at all when calculating potential imbalance, only skewness. So in a
scenario where regions are evenly distributed and there's no skewness,
CacheAwareLoadBalancer would not trigger any moves, even when there are
regions with low cache ratio on many servers and enough cache space to
accommodate those regions' data in the region servers' cache.
The solution for this problem will be split into two jiras, for ease of review.
The first part, to be worked on here, will only include low cache ratio regions
in the imbalance calculation, so that CacheAwareLoadBalancer can trigger the
computation of new assignment plans.
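
As a rough illustration of the idea only, not the actual CacheAwareLoadBalancer
code: the sketch below treats a cluster as needing a balancer run when either
region counts are skewed or some low cache ratio region could fit in another
server's free cache. Every name here (CacheImbalanceSketch, Region, Server,
LOW_CACHE_RATIO, the slop parameter) is hypothetical, and the skew check is a
simplified stand-in for the balancer's real one.

```java
import java.util.List;

public class CacheImbalanceSketch {
  // Hypothetical cutoff below which a region counts as "low cache ratio".
  public static final double LOW_CACHE_RATIO = 0.5;

  public static final class Region {
    final double cacheRatio; // fraction of the region's data in cache, 0.0 to 1.0
    final long sizeMb;       // size of the region's data

    public Region(double cacheRatio, long sizeMb) {
      this.cacheRatio = cacheRatio;
      this.sizeMb = sizeMb;
    }
  }

  public static final class Server {
    final long freeCacheMb;     // unused cache capacity on this server
    final List<Region> regions; // regions currently hosted

    public Server(long freeCacheMb, List<Region> regions) {
      this.freeCacheMb = freeCacheMb;
      this.regions = regions;
    }
  }

  // Simplified skew check: true when any server's region count falls outside
  // the average count widened by the slop factor.
  public static boolean isSkewed(List<Server> servers, double slop) {
    double avg = servers.stream().mapToInt(s -> s.regions.size()).average().orElse(0.0);
    int lower = (int) Math.floor(avg * (1 - slop));
    int upper = (int) Math.ceil(avg * (1 + slop));
    return servers.stream()
        .anyMatch(s -> s.regions.size() < lower || s.regions.size() > upper);
  }

  // True when some low cache ratio region would fit entirely in the free
  // cache of a server other than the one hosting it.
  public static boolean hasRelocatableLowCacheRegion(List<Server> servers) {
    for (Server host : servers) {
      for (Region region : host.regions) {
        if (region.cacheRatio >= LOW_CACHE_RATIO) {
          continue;
        }
        for (Server other : servers) {
          if (other != host && other.freeCacheMb >= region.sizeMb) {
            return true;
          }
        }
      }
    }
    return false;
  }

  // Skewness alone no longer decides: a poorly cached region with somewhere
  // better to go is also enough to ask for new assignment plans.
  public static boolean needsBalance(List<Server> servers, double slop) {
    return isSkewed(servers, slop) || hasRelocatableLowCacheRegion(servers);
  }
}
```

With two servers holding two regions each, isSkewed reports no skew, yet
needsBalance still returns true as long as one server holds a poorly cached
region and the other has enough free cache for it.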
HBASE-30135 should provide the changes needed to recalculate cache ratio on the
assignment plans, simulating that low cache ratio regions moved to servers with
enough free cache space would get fully cached, so that such a plan scores
higher than the current state.
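
A minimal sketch of that simulation, under stated assumptions: the class name,
both methods, and the cost definition (average uncached fraction, lower is
better) are hypothetical and are not taken from HBASE-30135. Leaving the ratio
unchanged when the target lacks space is a simplification made here.

```java
public class PlanCacheRatioSketch {
  // Projected cache ratio of a region after a simulated move: assume it ends
  // up fully cached when the target server has enough free cache for it.
  // Otherwise the current ratio is kept (a simplification).
  public static double projectedRatio(double currentRatio, long regionSizeMb,
      long targetFreeCacheMb) {
    return targetFreeCacheMb >= regionSizeMb ? 1.0 : currentRatio;
  }

  // Hypothetical cost of a state: the average uncached fraction across regions.
  public static double cost(double[] cacheRatios) {
    if (cacheRatios.length == 0) {
      return 0.0;
    }
    double uncached = 0.0;
    for (double ratio : cacheRatios) {
      uncached += 1.0 - ratio;
    }
    return uncached / cacheRatios.length;
  }
}
```

Under this cost, a plan that moves a poorly cached region to a server with
enough free cache is cheaper than the current state, which is what lets the
balancer prefer it.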
was:
When time-based priority is enabled on CFs that are already cached, on clusters
with datasets larger than the cache capacity, once the now-cold data gets
evicted, hot data that was previously uncached for lack of cache space
remains uncached unless:
1) A client read request reads it from the file system and caches it;
2) An operator manually disables and re-enables the tables with hot data, so
that the prefetch executor can run and cache those blocks.
Both options are suboptimal, leading to temporary performance impacts and/or
requiring manual intervention.
The CacheAwareLoadBalancer currently only raises the cost of moving highly
cached regions when calculating assignment plans. It doesn't consider cache
ratio at all when calculating potential imbalance, only skewness. So in a
scenario where regions are evenly distributed and there's no skewness,
CacheAwareLoadBalancer would not trigger any moves, even when there are
regions with low cache ratio on many servers and enough cache space to
accommodate those regions' data in the region servers' cache.
This proposal is to include low cache ratio regions in the imbalance
calculation, so that CacheAwareLoadBalancer can trigger the computation of new
assignment plans.
It also needs to recalculate cache ratio on the assignment plans, simulating
that low cache ratio regions moved to servers with enough free cache space
would get fully cached, so that such a plan scores higher than the current
state.
> Improve CacheAwareLoadBalancer to consider low cache ratio when calculating
> imbalance
> -------------------------------------------------------------------------------------
>
> Key: HBASE-30134
> URL: https://issues.apache.org/jira/browse/HBASE-30134
> Project: HBase
> Issue Type: Sub-task
> Reporter: Wellington Chevreuil
> Assignee: Wellington Chevreuil
> Priority: Major
>
> When time-based priority is enabled on CFs that are already cached, on
> clusters with datasets larger than the cache capacity, once the now-cold data
> gets evicted, hot data that was previously uncached for lack of cache space
> remains uncached unless:
> 1) A client read request reads it from the file system and caches it;
> 2) An operator manually disables and re-enables the tables with hot data, so
> that the prefetch executor can run and cache those blocks.
> Both options are suboptimal, leading to temporary performance impacts and/or
> requiring manual intervention.
> The CacheAwareLoadBalancer currently only raises the cost of moving highly
> cached regions when calculating assignment plans. It doesn't consider cache
> ratio at all when calculating potential imbalance, only skewness. So in a
> scenario where regions are evenly distributed and there's no skewness,
> CacheAwareLoadBalancer would not trigger any moves, even when there are
> regions with low cache ratio on many servers and enough cache space to
> accommodate those regions' data in the region servers' cache.
> The solution for this problem will be split into two jiras, for ease of
> review. The first part, to be worked on here, will only include low cache
> ratio regions in the imbalance calculation, so that CacheAwareLoadBalancer
> can trigger the computation of new assignment plans.
> HBASE-30135 should provide the changes needed to recalculate cache ratio on
> the assignment plans, simulating that low cache ratio regions moved to
> servers with enough free cache space would get fully cached, so that such a
> plan scores higher than the current state.
>