[ https://issues.apache.org/jira/browse/HBASE-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216608#comment-13216608 ]
stack commented on HBASE-5479:
------------------------------

Todd suggests something like scoring over here, Matt: https://issues.apache.org/jira/browse/HBASE-2457?focusedCommentId=12857705&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12857705

Let's verify that we do indeed do selection at queuing time. That's my suspicion. If that's the case, it for sure needs fixing.

Thanks for filing this one, Matt.

> Postpone CompactionSelection to compaction execution time
> ---------------------------------------------------------
>
>                 Key: HBASE-5479
>                 URL: https://issues.apache.org/jira/browse/HBASE-5479
>             Project: HBase
>          Issue Type: New Feature
>          Components: io, performance, regionserver
>            Reporter: Matt Corgan
>
> It can be commonplace for regionservers to develop long compaction queues,
> meaning a CompactionRequest may execute hours after it was created. The
> CompactionRequest holds a CompactionSelection that was selected at request
> time but may no longer be the optimal selection. The CompactionSelection
> should be created at compaction execution time rather than compaction request
> time.
> The current mechanism breaks down during high volume insertion. The
> inefficiency is clearest when the inserts are finished. Inserting for 5
> hours may build up 50 storefiles and a 40 element compaction queue. When
> finished inserting, you would prefer that the next compaction merges all 50
> files (or some large subset), but the current system will churn through each
> of the 40 compaction requests, the first of which may be hours old. This
> ends up re-compacting the same data many times.
> The current system is especially inefficient when dealing with time series
> data where the data in the storefiles has minimal overlap. With time series
> data, there is even less benefit to intermediate merges because most
> storefiles can be eliminated based on their key range during a read, even
> without bloomfilters.
> The only goal should be to reduce file count, not to
> minimize the number of files merged for each read.
> There are other aspects to the current queuing mechanism that would need to
> be looked at. You would want to avoid having the same Store in the queue
> multiple times. And you would want the completion of one compaction to
> possibly queue another compaction request for the store.
> An alternative architecture to the current style of queues would be to have
> each Store (all open in memory) keep a compactionPriority score up to date
> after events like flushes, compactions, schema changes, etc. Then you create
> a "CompactionPriorityComparator implements Comparator<Store>" and stick all
> the Stores into a PriorityQueue (synchronized remove/add from the queue when
> the value changes). The async compaction threads would keep pulling off the
> head of that queue as long as the head has compactionPriority > X.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
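
For discussion, here is a rough sketch of the priority-queue idea from the description. It is not HBase code. Store is a hypothetical stand-in that only tracks a store file count, the score (file count) and the threshold value are assumptions, and CompactionScheduler, onFlush and compactNext are made-up names. Only CompactionPriorityComparator and the remove/add-on-change pattern come from the description. The point it illustrates is that the file selection happens when a thread pulls the Store off the queue, not when the work was queued.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

public class CompactionPrioritySketch {

    /** Hypothetical stand-in for a Store: it only tracks how many store files it has. */
    static final class Store {
        final String name;
        int storeFileCount;

        Store(String name) { this.name = name; }

        /** Assumed scoring: more store files means more urgent. */
        int compactionPriority() { return storeFileCount; }
    }

    /** Orders Stores so the highest compactionPriority sits at the head of the queue. */
    static final class CompactionPriorityComparator implements Comparator<Store> {
        @Override
        public int compare(Store a, Store b) {
            return Integer.compare(b.compactionPriority(), a.compactionPriority());
        }
    }

    static final class CompactionScheduler {
        private final PriorityQueue<Store> queue =
            new PriorityQueue<>(new CompactionPriorityComparator());
        private final int threshold; // the "X" from the description

        CompactionScheduler(int threshold) { this.threshold = threshold; }

        /** Remove, mutate, re-add: the Store appears at most once and is never stale. */
        synchronized void onFlush(Store store) {
            queue.remove(store); // no-op if the Store is not queued yet
            store.storeFileCount++;
            queue.add(store);
        }

        /**
         * Compacts the head Store if its priority exceeds the threshold.
         * Selection is made here, at execution time, from the files present now.
         * Returns the number of files merged, or 0 if nothing qualified.
         * A real merge would not run while holding this lock.
         */
        synchronized int compactNext() {
            Store head = queue.peek();
            if (head == null || head.compactionPriority() <= threshold) {
                return 0;
            }
            queue.poll();
            int selected = head.storeFileCount; // select everything present right now
            head.storeFileCount = 1;            // pretend the merge left a single file
            queue.add(head);                    // re-queue with the new, lower priority
            return selected;
        }
    }

    public static void main(String[] args) {
        CompactionScheduler scheduler = new CompactionScheduler(3);
        Store busy = new Store("busy");
        Store quiet = new Store("quiet");
        for (int i = 0; i < 50; i++) scheduler.onFlush(busy);
        for (int i = 0; i < 2; i++) scheduler.onFlush(quiet);

        System.out.println(scheduler.compactNext()); // 50: one merge instead of 40 stale requests
        System.out.println(scheduler.compactNext()); // 0: no Store is above the threshold
    }
}
```

With 50 flushes on one Store, the single compactNext() call selects all 50 files at once, which is the behavior the description asks for after a long insert run. Merging everything is just the simplest possible selection policy for the sketch; a real policy would probably cap the selection size.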