[
https://issues.apache.org/jira/browse/HBASE-3242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970049#action_12970049
]
Nicolas Spiegelberg commented on HBASE-3242:
--------------------------------------------
So, we talked about this internally; I wasn't 100% thinking along Stack's lines.
My assumption is that we could support per-CF flushing of the MemStore and would
therefore have a different seqno per CF to prune on. This would prevent a
slow-growing CF from being flushed until it reaches a significant size.
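A minimal sketch of the per-CF bookkeeping this implies (the class and method names are hypothetical, not existing regionserver APIs):
{code:java}
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-CF flush tracker: each column family keeps its oldest
// unflushed sequence number, so the HLog can be pruned up to the minimum
// across all CFs while each CF flushes independently.
public class PerFamilyFlushTracker {
  private final Map<String, Long> oldestUnflushedSeqNo = new HashMap<>();
  private final Map<String, Long> memstoreBytes = new HashMap<>();
  private final long flushSizeBytes;

  public PerFamilyFlushTracker(long flushSizeBytes) {
    this.flushSizeBytes = flushSizeBytes;
  }

  // Record an edit: remember the first (oldest) unflushed seqno and
  // accumulate memstore size per CF.
  public void onEdit(String family, long seqNo, long editBytes) {
    oldestUnflushedSeqNo.putIfAbsent(family, seqNo);
    memstoreBytes.merge(family, editBytes, Long::sum);
  }

  // Only a CF that has grown past the flush threshold is flushed, so a
  // slow-growing CF keeps accumulating instead of producing tiny HFiles.
  public boolean shouldFlush(String family) {
    return memstoreBytes.getOrDefault(family, 0L) >= flushSizeBytes;
  }

  // After a CF flush, its edits are durable in HFiles and no longer pin
  // the log.
  public void onFlush(String family) {
    oldestUnflushedSeqNo.remove(family);
    memstoreBytes.put(family, 0L);
  }

  // The HLog may only be pruned below the minimum unflushed seqno across
  // all CFs in the region.
  public long pruneableBelowSeqNo() {
    return oldestUnflushedSeqNo.values().stream()
        .min(Long::compare).orElse(Long.MAX_VALUE);
  }
}
{code}
The only global constraint left is pruneableBelowSeqNo(); everything else becomes a per-CF decision.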
Another thing to keep in mind:
HFile compaction = read HFiles + merge + write new HFiles
HLog compaction = snapshot MemStore + prune + write aggregate HLog
So HLog compaction only adds write IO, not read IO. All said, Karthik's
HBASE-3327 suggestion would be much easier to implement in the short term,
since HLog compactions would require merging the snapshot MemStore + the
current MemStore after the compaction has finished.
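To make the IO asymmetry concrete, here is a hedged sketch of that prune-and-rewrite step (LogEntry and HLogCompactor are illustrative stand-ins, not actual regionserver classes):
{code:java}
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative HLog compaction: the MemStore snapshot is already in memory,
// so compaction only prunes obsolete entries and writes one aggregate log.
// No read IO is added, unlike HFile compaction, which must first read the
// existing HFiles back in before merging.
public class HLogCompactor {

  // Stand-in for a WAL entry, keyed by (row, family, qualifier).
  public record LogEntry(String key, long seqNo, byte[] value) {}

  // Keep only the latest entry per key from the snapshot; the survivors
  // are appended to a fresh aggregate HLog. In the incrementColumnValue
  // case this collapses N log entries into one.
  public static List<LogEntry> compact(List<LogEntry> memstoreSnapshot) {
    NavigableMap<String, LogEntry> latest = new TreeMap<>();
    for (LogEntry e : memstoreSnapshot) {
      latest.merge(e.key(), e, (a, b) -> a.seqNo() >= b.seqNo() ? a : b);
    }
    return List.copyOf(latest.values()); // write path only: the new HLog
  }
}
{code}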
> HLog Compactions
> ----------------
>
> Key: HBASE-3242
> URL: https://issues.apache.org/jira/browse/HBASE-3242
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Reporter: Nicolas Spiegelberg
>
> Currently, our memstore flush algorithm is pretty trivial. We let the
> memstore grow to a flush size and flush a region, or let the logs grow to a
> certain count and then flush everything below a seqid. In certain
> situations, we can get big wins from being more intelligent with our
> memstore flush algorithm. I suggest we look into algorithms to intelligently
> handle HLog compactions. By compaction, I mean replacing existing HLogs with
> new HLogs created from the contents of a memstore snapshot. Situations where
> we can get huge wins:
> 1. In the incrementColumnValue case, N HLog entries often correspond to a
> single memstore entry. Although we may have large HLog files, our memstore
> could be relatively small.
> 2. If we have a hot region, the majority of the HLog consists of that one
> region's edits, while the edits from other regions are minuscule.
> In both cases, we are forced to flush a bunch of very small stores. It's
> really hard for a compaction algorithm to be efficient when it has no
> guarantee of the approximate size of a new StoreFile, so it currently does
> unconditional, inefficient compactions. Additionally, compactions & flushes
> suck because they invalidate cache entries, be it the memstore or the LRU
> cache. If we can limit flushes to cases where we will have significant
> HFile output on a per-Store basis, we can get improved performance,
> stability, and reduced failover time.
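For context on the quoted description, the two "trivial" triggers reduce to independent checks; a minimal sketch, with illustrative thresholds rather than the real configuration keys:
{code:java}
// Minimal sketch of the two flush triggers described above; the constants
// are illustrative, not the actual HBase configuration values.
public class FlushTrigger {
  private final long regionFlushSizeBytes; // e.g. 64 MB per region
  private final int maxLogCount;           // e.g. 32 HLogs

  public FlushTrigger(long regionFlushSizeBytes, int maxLogCount) {
    this.regionFlushSizeBytes = regionFlushSizeBytes;
    this.maxLogCount = maxLogCount;
  }

  // Trigger 1: a single region's memstore grew past the flush size.
  public boolean regionNeedsFlush(long regionMemstoreBytes) {
    return regionMemstoreBytes >= regionFlushSizeBytes;
  }

  // Trigger 2: too many HLogs, so every region with edits below the oldest
  // log's seqid gets flushed -- this is what forces the "bunch of very
  // small stores" flushes the description complains about.
  public boolean tooManyLogs(int currentLogCount) {
    return currentLogCount >= maxLogCount;
  }
}
{code}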