[ https://issues.apache.org/jira/browse/HBASE-27590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Peter Somogyi reopened HBASE-27590:
-----------------------------------
> Change Iterable to List in SnapshotFileCache
> --------------------------------------------
>
> Key: HBASE-27590
> URL: https://issues.apache.org/jira/browse/HBASE-27590
> Project: HBase
> Issue Type: Improvement
> Reporter: Peter Somogyi
> Assignee: Peter Somogyi
> Priority: Minor
> Fix For: 2.6.0, 3.0.0-alpha-4, 2.5.4
>
> Attachments: flame-1.html
>
>
> The HFileCleaners can have low performance on a large /archive area when
> used with slow storage like S3. The snapshot write lock in
> SnapshotFileCache is held while the file metadata is fetched from S3, so
> even with multiple cleaner threads, only a single cleaner can effectively
> delete files from the archive.
> The fix performs the file metadata collection before SnapshotHFileCleaner
> runs, simply by changing the type of the parameter passed through
> FileCleanerDelegate from Iterable to List.
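> A minimal sketch of the difference, using simplified, hypothetical names
> (the real code lives in SnapshotFileCache and the cleaner delegates, and
> uses a read-write lock rather than this plain lock):
> {code:java}
> import java.util.List;
> import java.util.concurrent.locks.ReentrantLock;
>
> // Simplified illustration only; not the actual HBase classes.
> public class LockScopeSketch {
>   private final ReentrantLock lock = new ReentrantLock();
>
>   // Before: the Iterable can be a lazy view, so every hasNext()/next()
>   // may issue a remote (S3) metadata call while the lock is held and
>   // every other cleaner thread is blocked.
>   void checkLazy(Iterable<String> lazyListing) {
>     lock.lock();
>     try {
>       for (String file : lazyListing) { // remote fetch happens in here
>         check(file);
>       }
>     } finally {
>       lock.unlock();
>     }
>   }
>
>   // After: taking a List forces the caller to materialize the listing
>   // first, so the remote metadata fetch completes before the lock is
>   // acquired and the critical section is pure in-memory work.
>   void checkEager(List<String> files) {
>     lock.lock();
>     try {
>       for (String file : files) { // in-memory iteration only
>         check(file);
>       }
>     } finally {
>       lock.unlock();
>     }
>   }
>
>   private void check(String file) {
>     // evaluate whether the file can be deleted
>   }
> }
> {code}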
> Running with the cleaner configuration below, I observed that the lock
> hold time in SnapshotFileCache went down from 45000ms to 100ms for a
> directory containing 1000 files. The complete evaluation and deletion of
> the folder took the same amount of time, but since the file metadata
> fetch from S3 happened outside of the lock, multiple cleaner threads were
> able to run concurrently.
> {noformat}
> hbase.cleaner.directory.sorting=false
> hbase.cleaner.scan.dir.concurrent.size=0.75
> hbase.regionserver.hfilecleaner.small.thread.count=16
> hbase.regionserver.hfilecleaner.large.thread.count=8
> {noformat}
> The files to evaluate are already passed in a List to
> CleanerChore.checkAndDeleteFiles, but the list is converted to an
> Iterable to run the checks on the configured cleaners.
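> For reference, the change described above presumably amounts to a
> parameter type change along these lines (a sketch; getDeletableFiles is
> the real FileCleanerDelegate method, but the surrounding interface is
> simplified and its other methods are omitted):
> {code:java}
> import java.util.List;
>
> import org.apache.hadoop.fs.FileStatus;
>
> // Sketch of org.apache.hadoop.hbase.master.cleaner.FileCleanerDelegate.
> public interface FileCleanerDelegate {
>   // Before:
>   //   Iterable<FileStatus> getDeletableFiles(Iterable<FileStatus> files);
>   // After: a materialized List, so delegates such as SnapshotHFileCleaner
>   // receive file metadata that was already fetched outside of their locks.
>   Iterable<FileStatus> getDeletableFiles(List<FileStatus> files);
> }
> {code}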