[ https://issues.apache.org/jira/browse/HBASE-3242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970038#action_12970038 ]

stack commented on HBASE-3242:
------------------------------

What about doing the following (I thought this was what you were describing, 
Nicolas, but it seems you are describing something else).  When we hit N 
HLogs, we currently flush those MemStores that hold the oldest edits.  What 
if instead we ran through the oldest M HLogs and rewrote them all into one 
new HLog file, dropping edits that have already made it into StoreFiles, 
i.e. those whose seqid is less than the oldest unflushed seqid on the 
regionserver.  We'd then atomically swap out the M old HLogs, moving them 
aside for replication or snapshotting or whatever, but the HRS would now 
replace them with the newly written HLog in its HLog accounting.  This would 
not be a compaction so much as a cleaning process.
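
For concreteness, here is a minimal sketch of that cleaning pass, assuming we 
can read the oldest M HLogs entry by entry and that the regionserver tracks 
its oldest unflushed seqid.  The HLogCleaner class and the 
LogFile/LogEntry/LogWriter types are hypothetical stand-ins rather than the 
existing HLog API; the atomic swap of the old logs for the new one is left to 
the caller.

import java.io.IOException;
import java.util.List;

/**
 * Minimal sketch of the cleaning pass described above: rewrite the oldest
 * M HLogs into a single new HLog, keeping only edits that have NOT yet made
 * it into StoreFiles (i.e. seqid >= the oldest unflushed seqid on the
 * regionserver).  The types below are hypothetical stand-ins, not the real
 * HLog API.
 */
public class HLogCleaner {

  /** Hypothetical view of a single HLog entry. */
  interface LogEntry {
    long getSeqId();
  }

  /** Hypothetical per-file reader; iterates entries in write order. */
  interface LogFile extends Iterable<LogEntry> {
  }

  /** Hypothetical writer for the single replacement HLog. */
  interface LogWriter {
    void append(LogEntry entry) throws IOException;
    void close() throws IOException;
  }

  /**
   * Rewrites the given (oldest) HLogs into {@code out}, dropping every edit
   * whose seqid is below the oldest unflushed seqid, i.e. edits already
   * durable in StoreFiles.  The atomic swap of the old logs for the new one
   * in the HRS's HLog accounting is the caller's job.
   */
  static void rewrite(List<LogFile> oldestLogs, LogWriter out,
      long oldestUnflushedSeqId) throws IOException {
    for (LogFile log : oldestLogs) {
      for (LogEntry entry : log) {
        if (entry.getSeqId() >= oldestUnflushedSeqId) {
          out.append(entry);  // still live: keep it in the rewritten log
        }
        // else: already flushed to a StoreFile, safe to drop
      }
    }
    out.close();
  }
}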

> HLog Compactions
> ----------------
>
>                 Key: HBASE-3242
>                 URL: https://issues.apache.org/jira/browse/HBASE-3242
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Nicolas Spiegelberg
>
> Currently, our memstore flush algorithm is pretty trivial.  We let a memstore 
> grow to the flushsize and flush that region, or let the log count grow to a 
> threshold and then flush everything below a seqid.  In certain situations, we 
> can get big wins from 
> being more intelligent with our memstore flush algorithm.  I suggest we look 
> into algorithms to intelligently handle HLog compactions.  By compaction, I 
> mean replacing existing HLogs with new HLogs created using the contents of a 
> memstore snapshot.  Situations where we can get huge wins:
> 1. In the incrementColumnValue case, N HLog entries often correspond to a 
> single memstore entry.  Although we may have large HLog files, our memstore 
> could be relatively small.
> 2. If we have a hot region, the majority of the HLog consists of that one 
> region and other region edits would be minuscule.
> In both cases, we are forced to flush a bunch of very small stores.  It's 
> really hard for a compaction algorithm to be efficient when it has no 
> guarantee of the approximate size of a new StoreFile, so it currently does 
> unconditional, inefficient compactions.  Additionally, compactions & flushes 
> suck because they invalidate cache entries, be it the memstore or the LRU 
> block cache.  If we can limit flushes to cases where we will have 
> significant HFile output on a per-Store basis, we can get improved 
> performance, stability, and reduced failover time.
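
To make case 1 above concrete, here is a toy program (hypothetical names, not 
the HBase API) modeling why HLog volume and memstore size diverge under 
incrementColumnValue: every increment appends one log record, but the memstore 
only ever holds the latest value for the cell, so the log-count trigger can 
fire while the memstore is still tiny.

import java.util.HashMap;
import java.util.Map;

/**
 * Toy model of case 1 (incrementColumnValue): every increment appends one
 * HLog record, but the memstore only keeps the latest value for the cell,
 * so log volume grows ~N times faster than memstore size.  All names here
 * are hypothetical; this is not the HBase API.
 */
public class IcvAmplification {
  public static void main(String[] args) {
    Map<String, Long> memstore = new HashMap<>();  // cell -> latest value
    long hlogEntries = 0;

    // One million increments against a single counter cell.
    for (int i = 0; i < 1_000_000; i++) {
      hlogEntries++;                                    // one HLog entry per edit
      memstore.merge("row/cf:counter", 1L, Long::sum);  // still one memstore cell
    }

    System.out.println("HLog entries written: " + hlogEntries);     // 1000000
    System.out.println("Memstore cells held : " + memstore.size()); // 1
    // When the log-count threshold trips, this tiny memstore gets flushed,
    // producing a very small StoreFile and extra compaction work later.
  }
}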
