Xiaolin Ha created HBASE-26342:
----------------------------------

             Summary: cleaner supports custom paths of independent 
configuration and pool
                 Key: HBASE-26342
                 URL: https://issues.apache.org/jira/browse/HBASE-26342
             Project: HBase
          Issue Type: Improvement
          Components: master
    Affects Versions: 2.0.0, 3.0.0-alpha-2
            Reporter: Xiaolin Ha
            Assignee: Xiaolin Ha


With this, we can clean some paths more quickly.

We found that in our cluster, which has a very large table with thousands of 
regions, high write throughput, and many tables with snapshots on the same 
cluster, files are deleted from the archive path more slowly than compaction 
moves files into it. As a result, the archive can accumulate PB-level data. 

The bottleneck is in the cleaner itself, not in the thread pool size or queue 
size. This is because SnapshotFileCache uses a synchronized lock, and each 
batch of files triggers one call to SnapshotFileCache#refreshCache(), which 
looks through all the snapshot dirs.
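
To illustrate why a larger pool does not help here, below is a minimal toy 
sketch. It is not the HBase code: ToySnapshotCache, getUnreferencedFiles(), 
run(), and the numbers used are all hypothetical stand-ins. It only models the 
two properties described above: one synchronized entry point, and one full 
pass over every snapshot dir per batch.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CleanerBottleneckSketch {

  /** Hypothetical stand-in for a snapshot file cache; NOT the HBase class. */
  static class ToySnapshotCache {
    private final int snapshotDirs;
    private long dirsScanned = 0; // guarded by "this"
    private final AtomicInteger inside = new AtomicInteger();
    private final AtomicInteger maxInside = new AtomicInteger();

    ToySnapshotCache(int snapshotDirs) {
      this.snapshotDirs = snapshotDirs;
    }

    // Called once per batch. The monitor lock admits one thread at a time.
    synchronized List<String> getUnreferencedFiles(List<String> batch) {
      int now = inside.incrementAndGet();
      maxInside.accumulateAndGet(now, Math::max);
      refreshCache();
      inside.decrementAndGet();
      return batch; // toy assumption: no file is referenced by a snapshot
    }

    // Toy refresh: walks every snapshot dir, so cost grows with their count.
    private void refreshCache() {
      for (int i = 0; i < snapshotDirs; i++) {
        dirsScanned++; // stands in for listing one snapshot directory
      }
    }

    synchronized long dirsScanned() {
      return dirsScanned;
    }

    int maxConcurrentCallers() {
      return maxInside.get();
    }
  }

  /** Submits one task per batch to a fixed pool and waits for completion. */
  static ToySnapshotCache run(int threads, int batches, int snapshotDirs)
      throws InterruptedException {
    ToySnapshotCache cache = new ToySnapshotCache(snapshotDirs);
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int b = 0; b < batches; b++) {
      final List<String> batch = Collections.singletonList("hfile-" + b);
      pool.execute(() -> {
        cache.getUnreferencedFiles(batch);
      });
    }
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.MINUTES);
    return cache;
  }

  public static void main(String[] args) throws InterruptedException {
    ToySnapshotCache cache = run(8, 100, 1000);
    // 100 batches x 1000 dirs = 100000 dir scans, all done under the lock.
    System.out.println("dirs scanned = " + cache.dirsScanned());
    System.out.println("max concurrent callers = "
        + cache.maxConcurrentCallers());
  }
}
```

In this toy model the scan work is batches times snapshot dirs, and at most 
one of the 8 threads is ever inside the cache, so adding threads or queue 
capacity changes nothing.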

Clearing a path without the SnapshotHFileCleaner is thirty times faster.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
