Huginn-kio commented on PR #7351:
URL: https://github.com/apache/hbase/pull/7351#issuecomment-3350891172

   In CleanerChore, a thread is assigned to each subdirectory under the archive 
directory to identify deletable files. When determining which files are not 
referenced by snapshots, all threads share a single SnapshotHFileCleaner and 
SnapshotHFileCache. 
   
   Since taking a snapshot changes the set of files referenced by snapshots, 
taking a snapshot and running the SnapshotHFileCleaner are mutually exclusive 
operations. Thus, taking a snapshot requires acquiring the 
takingSnapshot.readLock, while the SnapshotHFileCleaner, when selecting 
unreferenced files, must acquire the takingSnapshot.writeLock. Moreover, 
because the SnapshotHFileCache is shared among threads, each thread must 
acquire the takingSnapshot.writeLock before refreshing the cache.
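
   The locking scheme described above can be sketched roughly as follows. This 
is a simplified illustration, not the HBase code: the class 
`CoarseLockedCleaner`, its fields, and its method signatures are hypothetical, 
and file names are plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the shared cleaner state; not the HBase classes.
class CoarseLockedCleaner {
  private final ReentrantReadWriteLock takingSnapshot = new ReentrantReadWriteLock();
  // Files referenced by snapshots; plays the role of the shared cache.
  private final Set<String> referenced = ConcurrentHashMap.newKeySet();

  // Snapshot path: concurrent snapshots may all hold the read lock at once.
  void takeSnapshot(Set<String> files) {
    takingSnapshot.readLock().lock();
    try {
      referenced.addAll(files);
    } finally {
      takingSnapshot.readLock().unlock();
    }
  }

  // Cleaner path: the write lock is held for the whole selection, so
  // cleaner threads run this method one at a time.
  List<String> getUnreferencedFiles(List<String> candidates) {
    takingSnapshot.writeLock().lock();
    try {
      List<String> unreferenced = new ArrayList<>();
      for (String file : candidates) {
        if (!referenced.contains(file)) unreferenced.add(file);
      }
      return unreferenced;
    } finally {
      takingSnapshot.writeLock().unlock();
    }
  }
}
```

   Because every cleaner thread needs the same exclusive lock for the entire 
call, the threads queue up behind one another no matter how many are 
configured.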
   
   With the current implementation, the coarse granularity of the lock 
serializes the selection of unreferenced files across threads, so increasing 
the number of cleaning threads does not significantly improve efficiency. To 
reduce lock granularity and improve cleaning efficiency, we introduced the 
following optimizations:
   
   1. Optimistic locking based on a version: Since taking a snapshot is not a 
high-frequency operation, this allows the selection of unreferenced files to 
proceed without holding a lock throughout the entire process.
   
   2. Fine-grained locking: When multiple threads concurrently determine 
unreferenced files, locks are required only when updating shared objects, 
namely the SnapshotHFileCache and the snapshot cache version marker.
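
   The two optimizations above could be combined along these lines. This is a 
minimal sketch under stated assumptions, not the code in this PR: the class 
`OptimisticCleaner`, the `version` and `inProgress` counters, the `CacheState` 
holder, and the choice to return an empty list when validation fails are all 
hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; names and structure are assumptions, not HBase code.
class OptimisticCleaner {
  // Immutable pairing of cache contents with the version they were built at.
  private static final class CacheState {
    final long version;
    final Set<String> files;
    CacheState(long version, Set<String> files) {
      this.version = version;
      this.files = files;
    }
  }

  private final AtomicLong version = new AtomicLong();          // bumped by each snapshot
  private final AtomicInteger inProgress = new AtomicInteger(); // snapshots currently running
  private final Set<String> snapshotStore = ConcurrentHashMap.newKeySet(); // stand-in for snapshot manifests
  private final Object cacheLock = new Object();
  private volatile CacheState cache = new CacheState(-1, Set.of());

  void takeSnapshot(Set<String> files) {
    inProgress.incrementAndGet();
    try {
      version.incrementAndGet();   // invalidates any selection started earlier
      snapshotStore.addAll(files);
    } finally {
      inProgress.decrementAndGet();
    }
  }

  List<String> getUnreferencedFiles(List<String> candidates) {
    long observed = version.get();
    if (inProgress.get() > 0) return List.of(); // snapshot running: skip this round

    Set<String> referenced = refreshIfStale(observed);
    List<String> unreferenced = new ArrayList<>();
    for (String file : candidates) {            // lock-free selection
      if (!referenced.contains(file)) unreferenced.add(file);
    }
    // Optimistic validation: keep the result only if no snapshot started meanwhile.
    return version.get() == observed ? unreferenced : List.of();
  }

  // Fine-grained lock: taken only when the shared cache must be rebuilt.
  private Set<String> refreshIfStale(long observed) {
    CacheState current = cache;
    if (current.version < observed) {
      synchronized (cacheLock) {
        current = cache;
        if (current.version < observed) {       // re-check after acquiring the lock
          current = new CacheState(observed, Set.copyOf(snapshotStore));
          cache = current;
        }
      }
    }
    return current.files;
  }
}
```

   In this sketch a failed validation simply yields no deletable files, on the 
assumption that the same candidates are offered again on a later cleaner run. 
Threads contend only on `cacheLock`, and only when the cached version is older 
than the version they observed.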


