[
https://issues.apache.org/jira/browse/HBASE-24619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Guanghao Zhang updated HBASE-24619:
-----------------------------------
Parent: HBASE-23634
Issue Type: Sub-task (was: Improvement)
> Try compact the recovered hfiles firstly after region online
> ------------------------------------------------------------
>
> Key: HBASE-24619
> URL: https://issues.apache.org/jira/browse/HBASE-24619
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 2.3.0
> Reporter: Guanghao Zhang
> Assignee: Guanghao Zhang
> Priority: Major
>
> As discussed in HBASE-23739 and in HBASE-24632, there may be many recovered
> hfiles. We should find a better way to compact them first once the region is
> online.
>
> For instance (quoting our [~anoop.hbase]):
> "Assume there were some small files because of flush but never got compacted
> before the RS down happened. We will look for the possible candidate from
> oldest files and in all chance the very old files would get excluded because
> of the size math. But it is possible that new flushed files would get
> selected. And we have the max files to compact config also which is 10 by
> default. Even these small files count alone might be >10. If there are say 15
> WAL files to split, for sure we will have at least 15 small HFiles.
> My thinking was this. After the region open, we have to make sure these small
> files are compacted in one go and we should not even consider the max files
> limit for this compaction. Also to note that these files might not even have
> the DBE/compression etc being applied. Ya coding wise am not sure how clean
> it might come."
>
> And from our [~pankaj2461]:
>
> "...concern is the compaction after region open, which impacts MTTR due to
> heavy IO in a large cluster with many outstanding WALs"
>
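The selection behaviour described in the first quote can be illustrated with a small sketch. This is a simplified model of ratio-based selection, not HBase's actual implementation: the function name `select_candidates`, its parameters, and the file sizes are hypothetical, and the defaults only loosely mirror `hbase.hstore.compaction.ratio` (1.2), `hbase.hstore.compaction.min` (3) and `hbase.hstore.compaction.max` (10).

```python
def select_candidates(sizes, ratio=1.2, min_files=3, max_files=10):
    """Pick store files for one compaction (simplified, hypothetical model).

    sizes: file sizes ordered oldest to newest (arbitrary units).
    Returns the indices of the selected files.
    Pass max_files=None to ignore the max-files limit.
    """
    start = 0
    # "Size math": skip an old file while it is larger than
    # ratio * (total size of all files newer than it).
    while (len(sizes) - start >= min_files
           and sizes[start] > ratio * sum(sizes[start + 1:])):
        start += 1
    if len(sizes) - start < min_files:
        return []  # too few files left to be worth compacting
    # Apply the max-files cap unless the caller disabled it.
    if max_files is None:
        end = len(sizes)
    else:
        end = min(len(sizes), start + max_files)
    return list(range(start, end))


# Two large, old files followed by 15 small recovered files
# (one per split WAL, as in the quoted example).
sizes = [10_000, 4_000] + [5] * 15

capped = select_candidates(sizes)                    # default cap of 10
uncapped = select_candidates(sizes, max_files=None)  # cap ignored

print(len(capped), len(uncapped))
```

In this model the two old files are excluded by the size check, and with the default cap only 10 of the 15 small files are picked, so 5 recovered files wait for a later compaction. Ignoring the cap for the first compaction after region open, as suggested in the quote, selects all 15 in one go.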