[
https://issues.apache.org/jira/browse/HBASE-26304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17425216#comment-17425216
]
Bryan Beaudreault commented on HBASE-26304:
-------------------------------------------
I submitted the above PR which straight ported the approach I used internally
for refreshing store files after locality had been healed. After thinking about
it some more, I've decided to take this in a different direction:
The LocalityHealer (which moves blocks to requested hosts) itself will end up
being an HDFS project contribution. In the narrow scope of HBase, this issue is
about ensuring a RegionServer can gracefully recover after blocks have been
moved from under it. Given the LocalityHealer will be an HDFS project
contribution, I think ideally the DFSClient itself can gracefully recover from
such an event.
With that in mind, I'm going to try to take a somewhat different approach:
* HDFS-15119 added a basic invalidation of DFSInputStream cached
LocatedBlocks. I'm going to expand upon that so that we can safely and reliably
refresh block locations for DFSInputStreams lacking a local replica:
https://issues.apache.org/jira/browse/HDFS-16262
* Additionally, I'm going to try to add a grace period to block invalidations
in https://issues.apache.org/jira/browse/HDFS-16261. When a block is moved with
REPLACE_BLOCK, the block is invalidated on the old host and asynchronously
deleted. Adding a configurable grace period on the deletion will give the
above refresh enough time to update cached locations and totally skip any pain
related to moving blocks around.
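
As a rough illustration of the grace period idea in the second bullet, here is a
minimal sketch in plain Java. It is not the HDFS implementation and does not use
any HDFS classes: GracePeriodDeletionQueue, invalidate, pollDeletable and
pendingCount are hypothetical names, and the caller supplies the clock value. It
only shows the intended behavior, where an invalidated replica stays readable
until its grace period has elapsed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; not the actual HDFS invalidation code path.
class GracePeriodDeletionQueue {
  // One invalidated block and the earliest time it may be physically deleted.
  private static final class Pending {
    final long blockId;
    final long deleteAtMillis;

    Pending(long blockId, long deleteAtMillis) {
      this.blockId = blockId;
      this.deleteAtMillis = deleteAtMillis;
    }
  }

  private final long gracePeriodMillis;
  // FIFO order equals deadline order as long as callers pass non-decreasing
  // timestamps, because every entry gets the same grace period.
  private final ArrayDeque<Pending> pending = new ArrayDeque<>();

  GracePeriodDeletionQueue(long gracePeriodMillis) {
    if (gracePeriodMillis < 0) {
      throw new IllegalArgumentException("grace period must be >= 0");
    }
    this.gracePeriodMillis = gracePeriodMillis;
  }

  // Record that a block was invalidated on the old host (e.g. after a move).
  synchronized void invalidate(long blockId, long nowMillis) {
    pending.addLast(new Pending(blockId, nowMillis + gracePeriodMillis));
  }

  // Return the blocks whose grace period has elapsed, in invalidation order.
  synchronized List<Long> pollDeletable(long nowMillis) {
    List<Long> due = new ArrayList<>();
    while (!pending.isEmpty() && pending.peekFirst().deleteAtMillis <= nowMillis) {
      due.add(pending.pollFirst().blockId);
    }
    return due;
  }

  // Number of invalidated blocks still waiting out their grace period.
  synchronized int pendingCount() {
    return pending.size();
  }
}
```

In this sketch a background deleter would call pollDeletable periodically with
the current time and delete only what it returns. A reader holding stale cached
locations then has up to the grace period to refresh them before the old replica
disappears.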
> Reflect out-of-band locality improvements in served requests
> ------------------------------------------------------------
>
> Key: HBASE-26304
> URL: https://issues.apache.org/jira/browse/HBASE-26304
> Project: HBase
> Issue Type: Sub-task
> Reporter: Bryan Beaudreault
> Assignee: Bryan Beaudreault
> Priority: Major
>
> Once the LocalityHealer has improved locality of a StoreFile (by moving
> blocks onto the correct host), the Reader's DFSInputStream and Region's
> localityIndex metric must be refreshed. Without refreshing the
> DFSInputStream, the improved locality will not improve latencies. In fact,
> the DFSInputStream may try to fetch blocks that have moved, resulting in a
> ReplicaNotFoundException. This is automatically retried, but the retry will
> increase long tail latencies relative to configured backoff strategy.
> See https://issues.apache.org/jira/browse/HDFS-16155 for an improvement in
> backoff strategy which can greatly mitigate latency impact of the missing
> block retry.
> Even with that mitigation, a StoreFile is often made up of many blocks.
> Without some sort of intervention, we will continue to hit
> ReplicaNotFoundException over time as clients naturally request data from
> moved blocks.
> In the original LocalityHealer design, I created a new
> RefreshHDFSBlockDistribution RPC on the RegionServer. This RPC accepts a list
> of region names and, for each region store, re-opens the underlying StoreFile
> if the locality has changed.
> I will submit a PR with that implementation, but I am also investigating
> other avenues. For example, I noticed
> https://issues.apache.org/jira/browse/HDFS-15119 which doesn't seem ideal but
> maybe can be improved as an automatic lower-level handling of block moves.