Xiaolin Ha created HBASE-26342:
----------------------------------
Summary: cleaner supports custom paths of independent
configuration and pool
Key: HBASE-26342
URL: https://issues.apache.org/jira/browse/HBASE-26342
Project: HBase
Issue Type: Improvement
Components: master
Affects Versions: 2.0.0, 3.0.0-alpha-2
Reporter: Xiaolin Ha
Assignee: Xiaolin Ha
With this, we can clean some paths more quickly.
In our cluster we found that, when a very large table with thousands of regions
and high write throughput shares the cluster with many snapshotted tables,
files are deleted from the archive path more slowly than compaction moves files
into it. The archive can then accumulate PB-level data.
The bottleneck is in the cleaner itself, not in the thread pool size or queue
size. SnapshotFileCache holds a synchronized lock, and each batch of files
requires one call to SnapshotFileCache#refreshCache(), which scans all the
snapshot dirs.
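To make the cost concrete, here is a toy model of the behaviour described above.
It is not HBase code: the class SnapshotCacheModel and its methods are
hypothetical stand-ins, and it assumes, as the description states, one full
scan of the snapshot dirs per batch behind a synchronized method.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a snapshot-aware cleaner check; hypothetical, not HBase code. */
public class SnapshotCacheModel {
    private final int snapshotDirs; // number of snapshot directories to scan
    private long dirsScanned = 0;   // total directory visits, a proxy for cost

    public SnapshotCacheModel(int snapshotDirs) {
        this.snapshotDirs = snapshotDirs;
    }

    /** Stand-in for refreshCache(): visits every snapshot directory once. */
    private void refreshCache() {
        dirsScanned += snapshotDirs;
    }

    /**
     * Stand-in for the per-batch check. The method is synchronized, so
     * concurrent cleaner threads are serialized here regardless of pool size.
     */
    public synchronized List<String> getUnreferencedFiles(List<String> batch) {
        refreshCache();                // one full scan per batch
        return new ArrayList<>(batch); // the toy model treats every file as deletable
    }

    public synchronized long getDirsScanned() {
        return dirsScanned;
    }
}
```

In this model the work grows as (number of batches) x (number of snapshot dirs),
and adding cleaner threads does not help because every call waits on the same
lock.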
Cleaning a path without the SnapshotHFileCleaner is thirty times faster.
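One way to picture the proposal is a router that sends configured paths to
their own pool and everything else to the shared pool. This is only a sketch of
the idea in the summary: CleanerPoolRouter, addCustomPath and poolFor are
hypothetical names, not the API or configuration introduced by this issue.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch: route selected archive paths to a dedicated pool. */
public class CleanerPoolRouter {
    private final ExecutorService sharedPool;
    private final Map<String, ExecutorService> customPools = new LinkedHashMap<>();

    public CleanerPoolRouter(int sharedThreads) {
        this.sharedPool = Executors.newFixedThreadPool(sharedThreads);
    }

    /** Registers a path that gets its own pool (and could get its own cleaner chain). */
    public void addCustomPath(String path, int threads) {
        customPools.put(path, Executors.newFixedThreadPool(threads));
    }

    /** Returns the dedicated pool if the path is, or lies under, a custom path. */
    public ExecutorService poolFor(String path) {
        for (Map.Entry<String, ExecutorService> e : customPools.entrySet()) {
            String custom = e.getKey();
            if (path.equals(custom) || path.startsWith(custom + "/")) {
                return e.getValue();
            }
        }
        return sharedPool; // default: the shared cleaner pool
    }

    public ExecutorService sharedPool() {
        return sharedPool;
    }

    /** Stops all pools; call when the router is no longer needed. */
    public void shutdown() {
        sharedPool.shutdown();
        for (ExecutorService pool : customPools.values()) {
            pool.shutdown();
        }
    }
}
```

A path served by a dedicated pool could then run with a cleaner chain that
omits the snapshot check, provided no snapshot references files under it.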
--
This message was sent by Atlassian Jira
(v8.3.4#803005)