[
https://issues.apache.org/jira/browse/HBASE-17215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15947351#comment-15947351
]
huaxiang sun commented on HBASE-17215:
--------------------------------------
Thanks [~carp84] and [~anoop.hbase]. This is not PCIe-SSD; for some reason
there are over 3 million files in the archive, the cleaner is very slow going
over these files, and disk space is being eaten up. A separate large-file
delete thread would help free the disk space faster. Going to take a look
at the patch, thanks for the quick response!
> Separate small/large file delete threads in HFileCleaner to accelerate
> archived hfile cleanup speed
> ---------------------------------------------------------------------------------------------------
>
> Key: HBASE-17215
> URL: https://issues.apache.org/jira/browse/HBASE-17215
> Project: HBase
> Issue Type: Improvement
> Reporter: Yu Li
> Assignee: Yu Li
> Attachments: HBASE-17215.patch
>
>
> When using PCIe-SSD the flush speed is really quick, and although we have
> per-CF flush, we still have the
> {{hbase.regionserver.optionalcacheflushinterval}} setting and some other
> mechanisms to avoid keeping data in memory too long, which flush small
> hfiles. In our online environment we found the single-threaded cleaner kept
> cleaning the earlier-flushed small files while large files got no chance,
> which caused the disk to fill up and then many other problems.
> Deleting hfiles in parallel with too many threads would also increase the
> workload on the namenode, so here we propose to separate large/small hfile
> cleaner threads, just as we do for compaction, and this turned out to work
> well in our cluster.
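The core idea above (route archived hfiles to separate delete queues by size so a flood of small files cannot starve large-file deletion) can be sketched roughly as follows. This is a minimal illustration, not the actual HBASE-17215 patch: the class name, the size threshold, and the queue structure are all assumptions for the sketch.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch only: split archived-file deletions into two queues by file size,
// so each tier can be drained by its own delete thread (not shown here),
// similar in spirit to HBase's separate small/large compaction threads.
public class SizeTieredCleanerSketch {
    private final BlockingQueue<Long> largeQueue = new LinkedBlockingQueue<>();
    private final BlockingQueue<Long> smallQueue = new LinkedBlockingQueue<>();
    private final long thresholdBytes; // boundary between "small" and "large"

    public SizeTieredCleanerSketch(long thresholdBytes) {
        this.thresholdBytes = thresholdBytes;
    }

    // Dispatch a file (represented by its size) to the matching tier's queue.
    public void submit(long fileSizeBytes) {
        if (fileSizeBytes >= thresholdBytes) {
            largeQueue.offer(fileSizeBytes);
        } else {
            smallQueue.offer(fileSizeBytes);
        }
    }

    public int largeQueueSize() { return largeQueue.size(); }
    public int smallQueueSize() { return smallQueue.size(); }
}
```

With one dedicated consumer thread per queue, millions of queued small files only delay other small files; a large file is picked up by the large-file thread as soon as it is archived, which is what frees disk space fastest.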
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)