[ 
https://issues.apache.org/jira/browse/HBASE-26304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17425216#comment-17425216
 ] 

Bryan Beaudreault commented on HBASE-26304:
-------------------------------------------

I submitted the above PR, which is a straight port of the approach I used 
internally for refreshing store files after locality had been healed. After 
thinking about it some more, I've decided to take this in a different direction:

The LocalityHealer (which moves blocks to requested hosts) itself will end up 
being an HDFS project contribution. In the narrow scope of HBase, this issue is 
about ensuring a RegionServer can gracefully recover after blocks have been 
moved from under it. Given the LocalityHealer will be an HDFS project 
contribution, I think ideally the DFSClient itself can gracefully recover from 
such an event.

With that in mind, I'm going to try to take a somewhat different approach:
 * HDFS-15119 added a basic invalidation of DFSInputStream cached 
LocatedBlocks. I'm going to expand upon that so that we can safely and reliably 
refresh block locations for DFSInputStreams lacking a local replica: 
https://issues.apache.org/jira/browse/HDFS-16262
 * Additionally, I'm going to try to add a grace period to block invalidations 
in https://issues.apache.org/jira/browse/HDFS-16261. When a block is moved with 
REPLACE_BLOCK, the block is invalidated on the old host and asynchronously 
deleted. Adding a configurable grace period before the deletion will give the 
above refresh enough time to update cached locations and totally skip any pain 
related to moving blocks around.
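
To illustrate how the two pieces interact, here is a toy Java model. It is only 
a sketch under stated assumptions: ToyDataNode, ToyInputStream and 
readAfterMove are hypothetical names, not HDFS classes, and time is passed in 
explicitly instead of using real clocks. It suggests the grace period should be 
at least as long as the client's refresh interval.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: none of these types are HDFS APIs. */
final class GracePeriodSketch {

    /** Hypothetical DataNode: an invalidated replica stays readable until its grace period ends. */
    static final class ToyDataNode {
        private final long gracePeriodMs;
        // blockId -> time at which the replica is deleted (Long.MAX_VALUE means "live")
        private final Map<String, Long> deleteAt = new HashMap<>();

        ToyDataNode(long gracePeriodMs) { this.gracePeriodMs = gracePeriodMs; }

        void addReplica(String blockId) { deleteAt.put(blockId, Long.MAX_VALUE); }

        void invalidate(String blockId, long nowMs) {
            if (deleteAt.containsKey(blockId)) {
                deleteAt.put(blockId, nowMs + gracePeriodMs);
            }
        }

        boolean canServe(String blockId, long nowMs) {
            Long t = deleteAt.get(blockId);
            return t != null && nowMs < t;
        }
    }

    /** Hypothetical input stream: caches block locations and refreshes them on a fixed interval. */
    static final class ToyInputStream {
        private final Map<String, ToyDataNode> nameNodeView; // authoritative locations
        private final Map<String, ToyDataNode> cached;       // locations cached at open/refresh
        private final long refreshIntervalMs;
        private long lastRefreshMs;

        ToyInputStream(Map<String, ToyDataNode> nameNodeView, long refreshIntervalMs, long nowMs) {
            this.nameNodeView = nameNodeView;
            this.cached = new HashMap<>(nameNodeView);
            this.refreshIntervalMs = refreshIntervalMs;
            this.lastRefreshMs = nowMs;
        }

        /** Returns true if the read is served on the first attempt (no missing-replica retry). */
        boolean read(String blockId, long nowMs) {
            if (nowMs - lastRefreshMs >= refreshIntervalMs) {
                cached.clear();
                cached.putAll(nameNodeView);
                lastRefreshMs = nowMs;
            }
            ToyDataNode dn = cached.get(blockId);
            return dn != null && dn.canServe(blockId, nowMs);
        }
    }

    /** Opens a stream at t=0, moves the block at t=0, then reads at t=readAtMs. */
    static boolean readAfterMove(long gracePeriodMs, long refreshIntervalMs, long readAtMs) {
        ToyDataNode oldHost = new ToyDataNode(gracePeriodMs);
        ToyDataNode newHost = new ToyDataNode(gracePeriodMs);
        oldHost.addReplica("blk_1");

        Map<String, ToyDataNode> nameNodeView = new HashMap<>();
        nameNodeView.put("blk_1", oldHost);
        ToyInputStream in = new ToyInputStream(nameNodeView, refreshIntervalMs, 0L);

        // The block move: new replica appears, old replica is invalidated.
        newHost.addReplica("blk_1");
        nameNodeView.put("blk_1", newHost);
        oldHost.invalidate("blk_1", 0L);

        return in.read("blk_1", readAtMs);
    }

    public static void main(String[] args) {
        // No grace period: the stale cached location fails before any refresh.
        System.out.println("grace=0, refresh=1000, read@500: " + readAfterMove(0, 1000, 500));
        // Grace period covers the refresh interval: reads succeed before and after the refresh.
        System.out.println("grace=2000, refresh=1000, read@500: " + readAfterMove(2000, 1000, 500));
        System.out.println("grace=2000, refresh=1000, read@1500: " + readAfterMove(2000, 1000, 1500));
        // Grace period shorter than the refresh interval: a window of failures remains.
        System.out.println("grace=500, refresh=1000, read@700: " + readAfterMove(500, 1000, 700));
    }
}
```

In this model a read only fails when it lands after the old replica's grace 
period has expired but before the stream's next refresh, which is the window 
the two changes together are meant to close.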

> Reflect out-of-band locality improvements in served requests
> ------------------------------------------------------------
>
>                 Key: HBASE-26304
>                 URL: https://issues.apache.org/jira/browse/HBASE-26304
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Bryan Beaudreault
>            Assignee: Bryan Beaudreault
>            Priority: Major
>
> Once the LocalityHealer has improved locality of a StoreFile (by moving 
> blocks onto the correct host), the Reader's DFSInputStream and Region's 
> localityIndex metric must be refreshed. Without refreshing the 
> DFSInputStream, the improved locality will not improve latencies. In fact, 
> the DFSInputStream may try to fetch blocks that have moved, resulting in a 
> ReplicaNotFoundException. This is automatically retried, but the retry will 
> increase long-tail latencies in proportion to the configured backoff strategy.
> See https://issues.apache.org/jira/browse/HDFS-16155 for an improvement in 
> backoff strategy which can greatly mitigate latency impact of the missing 
> block retry.
> Even with that mitigation, a StoreFile is often made up of many blocks. 
> Without some sort of intervention, we will continue to hit 
> ReplicaNotFoundException over time as clients naturally request data from 
> moved blocks.
> In the original LocalityHealer design, I created a new 
> RefreshHDFSBlockDistribution RPC on the RegionServer. This RPC accepts a list 
> of region names and, for each region store, re-opens the underlying StoreFile 
> if the locality has changed.
> I will submit a PR with that implementation, but I am also investigating 
> other avenues. For example, I noticed 
> https://issues.apache.org/jira/browse/HDFS-15119 which doesn't seem ideal but 
> maybe can be improved as an automatic lower-level handling of block moves.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
