[ https://issues.apache.org/jira/browse/HBASE-26342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17530991#comment-17530991 ]
Andrew Kyle Purtell commented on HBASE-26342:
---------------------------------------------
Going to roll 2.5.0RC0 soon. Keep this in for imminent resolution or bump it?
Will look at the PR soon...
> cleaner supports custom paths of independent configuration and pool
> -------------------------------------------------------------------
>
> Key: HBASE-26342
> URL: https://issues.apache.org/jira/browse/HBASE-26342
> Project: HBase
> Issue Type: New Feature
> Components: master
> Affects Versions: 1.7.1, 3.0.0-alpha-2, 2.4.10
> Reporter: Xiaolin Ha
> Assignee: Xiaolin Ha
> Priority: Major
> Fix For: 2.5.0, 2.6.0, 3.0.0-alpha-3
>
> Attachments: result.png
>
>
> With this, we can clean some paths more quickly.
> In our cluster, we found that when there is a very large table with
> thousands of regions and high write throughput, plus many snapshotted
> tables on the same cluster, files are deleted from the archive path more
> slowly than compaction moves files into it. The archive can then
> accumulate PB-level data.
> The bottleneck is in the cleaner itself, not in the thread pool size or
> queue size. This is because SnapshotFileCache uses a synchronized lock, and
> each batch of files requires one call to SnapshotFileCache#refreshCache(),
> which looks through all the snapshot dirs.
> Clearing a path without the SnapshotHFileCleaner is thirty times faster.
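
To make the bottleneck described in the issue concrete, here is a toy Java model of it. This is only a sketch under stated assumptions and is not HBase code: the class SnapshotCacheSketch, its methods, and the in-memory "snapshot dirs" are all hypothetical stand-ins. It shows just the two properties the description names: a single synchronized lock shared by all cleaner threads, and one full pass over every snapshot dir per batch of files.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the contention described in the issue; hypothetical, not HBase code. */
class SnapshotCacheSketch {
    // Each set stands in for the files referenced by one snapshot dir (assumption).
    private final List<Set<String>> snapshotDirs;
    private final Set<String> referenced = new HashSet<>();
    private int dirsScanned = 0;

    SnapshotCacheSketch(List<Set<String>> snapshotDirs) {
        this.snapshotDirs = snapshotDirs;
    }

    /**
     * Returns the files in the batch that no snapshot references.
     * synchronized: every cleaner thread serializes on this one lock, so a
     * larger thread pool cannot speed this step up.
     */
    synchronized List<String> unreferenced(List<String> batch) {
        refreshCache(); // one refresh per batch, as the description states
        List<String> deletable = new ArrayList<>();
        for (String file : batch) {
            if (!referenced.contains(file)) {
                deletable.add(file);
            }
        }
        return deletable;
    }

    /** Walks every snapshot dir; the cost grows with the number of snapshots. */
    private void refreshCache() {
        referenced.clear();
        for (Set<String> dir : snapshotDirs) {
            referenced.addAll(dir);
            dirsScanned++;
        }
    }

    synchronized int dirsScanned() {
        return dirsScanned;
    }
}
```

In this model, cleaning B batches against S snapshot dirs costs B * S dir scans, all performed under the same lock. A path cleaned without the snapshot check skips that cost entirely, which is consistent with the motivation for giving selected paths their own cleaner configuration and pool.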