Tak-Lon (Stephen) Wu created HBASE-27495:
--------------------------------------------
Summary: Improve HFileLinkCleaner to validate back reference links
ahead the next traverse
Key: HBASE-27495
URL: https://issues.apache.org/jira/browse/HBASE-27495
Project: HBase
Issue Type: Improvement
Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.5.2
Reporter: Tak-Lon (Stephen) Wu
Assignee: Tak-Lon (Stephen) Wu
We found a race in the CleanerChore related to back reference links. When the
HFileLinkCleaner runs for a file, it makes one of two decisions depending on
the file type:
- HFiles: the cleaner only checks whether the .links-<> directory is present
and contains files.
- Back reference links: the cleaner checks whether the forward link is still
available in the data directory.
The order in which the cleaner checks these 2 files matters. When the back
reference is checked first, the cleaner can remove both the reference and the
HFile from the archive in the same run. When it runs for the HFile first,
however, the .links-<> directory still contains the back reference, so the
HFile is kept and only the back reference is removed. In this case, the HFile
is only deleted in the next iteration of the CleanerChore, and that iteration
can be very slow when the list of files is huge, as in the case of an object
store.
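
The order dependence can be reproduced with a small toy model. This is only a
sketch, not HBase code: the function names, the path layout
(".links-hfile1/ref1"), and the set of live forward links are assumptions made
up for illustration, and the two checks are simplified versions of the rules
listed above.

```python
def is_deletable(path, remaining, live_forward_links):
    """Simplified version of the two checks (hypothetical helper)."""
    if path.startswith(".links-"):
        # Back reference: deletable once its forward link is gone.
        return path not in live_forward_links
    # HFile: deletable only if its .links- directory holds no files.
    prefix = f".links-{path}/"
    return not any(p.startswith(prefix) for p in remaining)


def run_cleaner_pass(archive, live_forward_links, visit_order):
    """Run one pass over visit_order and return the paths left behind."""
    remaining = set(archive)
    for path in visit_order:
        if path in remaining and is_deletable(path, remaining, live_forward_links):
            remaining.discard(path)
    return remaining


archive = {"hfile1", ".links-hfile1/ref1"}
live_forward_links = set()  # the forward link was already removed

# Back reference visited first: both entries go in a single pass.
backref_first = run_cleaner_pass(
    archive, live_forward_links, [".links-hfile1/ref1", "hfile1"])

# HFile visited first: it is kept, and only the back reference goes.
hfile_first = run_cleaner_pass(
    archive, live_forward_links, ["hfile1", ".links-hfile1/ref1"])

# The leftover HFile is only removed by a second pass.
second_pass = run_cleaner_pass(hfile_first, live_forward_links, ["hfile1"])
```

In this model, backref_first ends up empty, while hfile_first still contains
"hfile1" until the second pass runs, which matches the behavior described
above.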