[ https://issues.apache.org/jira/browse/HBASE-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tak-Lon (Stephen) Wu resolved HBASE-27495.
------------------------------------------
Hadoop Flags: Reviewed
Resolution: Fixed
> Improve HFileLinkCleaner to validate back reference links ahead the next
> traverse
> ----------------------------------------------------------------------------------
>
> Key: HBASE-27495
> URL: https://issues.apache.org/jira/browse/HBASE-27495
> Project: HBase
> Issue Type: Improvement
> Components: master
> Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.5.2
> Reporter: Tak-Lon (Stephen) Wu
> Assignee: Tak-Lon (Stephen) Wu
> Priority: Major
> Fix For: 2.6.0, 3.0.0-alpha-4, 2.5.2
>
>
> We found a race in the CleanerChore related to back reference links. When
> the HFileLinkCleaner runs for a file, it makes one of two decisions depending
> on the file type:
> - HFiles: the cleaner only checks whether the .links-<> directory is present
> and contains files.
> - Back reference links: the cleaner checks whether the forward link is still
> available in the data directory.
> The order in which the cleaner checks these two file types matters. When the
> back reference is checked first, the cleaner can remove both the reference
> and the HFile from the archive in the same run. When the HFile is checked
> first, only the back reference is removed. In that case the HFile is deleted
> only in the next iteration of the CleanerChore, which can be very slow when
> the list of files is huge, for example on an object store.
> The goal of this task is to improve the traversal of archived HFiles by
> reusing the list of back reference files already found and immediately
> applying the checks for the back reference links.
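
The ordering problem described above can be illustrated with a minimal
in-memory sketch. This is not the HBase implementation: the class
ArchiveCleanerSketch, both method names, and the map-based model of the
archive are hypothetical, and a back reference is assumed to be identified by
the same string as its forward link.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory model of the archive; not the real HFileLinkCleaner. */
final class ArchiveCleanerSketch {

  /**
   * One cleaner pass where the HFile decision is made before its back
   * references are validated. archive maps an archived HFile name to the
   * back references found in its .links-<> directory; liveForwardLinks holds
   * the forward links that still exist in the data directory.
   * Returns the number of HFiles deleted in this pass.
   */
  static int cleanHFileFirst(Map<String, Set<String>> archive, Set<String> liveForwardLinks) {
    int deleted = 0;
    Iterator<Map.Entry<String, Set<String>>> it = archive.entrySet().iterator();
    while (it.hasNext()) {
      Set<String> backRefs = it.next().getValue();
      // HFile check: only looks at whether back references are present.
      boolean hfileDeletable = backRefs.isEmpty();
      // Back reference check runs afterwards: drop refs whose forward link is gone.
      backRefs.removeIf(ref -> !liveForwardLinks.contains(ref));
      if (hfileDeletable) {
        it.remove();
        deleted++;
      }
    }
    return deleted;
  }

  /**
   * One cleaner pass that reuses the listed back references and validates
   * them first, so an HFile whose references are all stale goes in the same pass.
   */
  static int cleanValidatingBackRefsFirst(Map<String, Set<String>> archive,
      Set<String> liveForwardLinks) {
    int deleted = 0;
    Iterator<Map.Entry<String, Set<String>>> it = archive.entrySet().iterator();
    while (it.hasNext()) {
      Set<String> backRefs = it.next().getValue();
      // Validate the back references ahead of the HFile decision.
      backRefs.removeIf(ref -> !liveForwardLinks.contains(ref));
      if (backRefs.isEmpty()) {
        it.remove();
        deleted++;
      }
    }
    return deleted;
  }
}
```

In this model, an HFile with a single stale back reference survives one pass
of cleanHFileFirst and is only removed by a second pass, while
cleanValidatingBackRefsFirst removes it in the first pass. An HFile whose
forward link still exists is kept by both.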
--
This message was sent by Atlassian Jira
(v8.20.10#820010)