[ https://issues.apache.org/jira/browse/HBASE-3242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932772#action_12932772 ]

Nicolas Spiegelberg commented on HBASE-3242:
--------------------------------------------

IRC communication below.  Highlights:

1. The ICV case is the highest priority for a number of use cases and would be 
simpler to implement.  We could address it first
2. Make sure we properly handle replication with any changes
3. 'compacting flush' capability would be another plus while we're digging into 
compaction code
------------------------------------------------
[5:21pm] tlipcon: hlog compaction? 
[5:22pm] nspiegelberg:   it's an idea that's been floating around in my head 
the past couple weeks.
[5:22pm] tlipcon: makes some sense
[5:22pm] nspiegelberg: the BigTable paper actually mentions that it compacts 
logs. that got me started thinking about the idea
[5:22pm] tlipcon: just gonna be tricky... all these heuristics
[5:22pm] dj_ryan: being fully persistent and high speed are hard to get
[5:23pm] dj_ryan: i was thinking it might be possible to improve the speed of 
log replay
[5:23pm] dj_ryan: in which case we could have more outstanding logs
[5:23pm] nspiegelberg: well I think the purpose is to efficiently address edge 
cases
[5:23pm] tlipcon: right, log replay and splitting are both kind of slow
[5:23pm] nspiegelberg: if log entries were uniformly distributed, what we have 
now is perfect
[5:23pm] tlipcon: nspiegelberg: I wonder how much the "compacting flush" would 
buy us
[5:24pm] tlipcon: what BT calls minor compactions
[5:24pm] nspiegelberg: that's another idea that jgray advocates
[5:25pm] nspiegelberg: really, we can implement a trivial HLog compaction that 
is only useful for ICV applications and it would be greatly beneficial for us
[5:33pm] apurtell: "a trivial HLog compaction that is only useful for ICV 
applications" -- seems a good start, we'd find that useful and so would stumble 
i believe 
[5:35pm] nspiegelberg: we already have a practical need for both cases of HLog 
compaction, but the ICV application is definitely higher priority.
[5:38pm] nspiegelberg: yeah, the only thing I haven't researched is replication 
impact.  I imagine that we could handle HLog compactions independently on each 
cluster.  Then, flag the compacted HLogs and just not send them to the replica 
cluster.
[5:39pm] dj_ryan: we might want some jd input
[5:39pm] dj_ryan: but basically there is a 'read point' in a hlog for the 
replication sender
[5:39pm] dj_ryan: so if you are compacting stuff that was already sent, we'll 
be ok
[5:39pm] dj_ryan: and at the target, they'd do similar things i guess
[5:40pm] nspiegelberg: definitely.  RFC.  I have migration woes right now, but 
I wanted to get the idea out there and have it running through people's heads 
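The replication-safety rule from the chat above can be sketched in a few lines: if the sender tracks a read point, any edit at or below it has already been shipped, so rewriting those edits cannot race replication.  A minimal illustration, assuming a simple seqid-ordered log; the class and method names here are hypothetical, not real HBase APIs:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (names are illustrative, not actual HBase code): WAL
// entries whose sequence id is at or below the replication sender's read
// point have already been shipped to the replica cluster, so only those
// entries are safe to rewrite during an HLog compaction.
class WalCompactionEligibility {
  static final class Entry {
    final long seqId;
    Entry(long seqId) { this.seqId = seqId; }
  }

  // Return the edits that are safe to compact on this cluster.
  static List<Entry> compactable(List<Entry> log, long replicationReadPoint) {
    List<Entry> safe = new ArrayList<>();
    for (Entry e : log) {
      if (e.seqId <= replicationReadPoint) {
        safe.add(e); // already sent to the replica; rewriting cannot race the sender
      }
    }
    return safe;
  }
}
```

Compacted HLogs would then be flagged so they are never re-shipped, and the replica cluster could apply the same rule independently.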


> HLog Compactions
> ----------------
>
>                 Key: HBASE-3242
>                 URL: https://issues.apache.org/jira/browse/HBASE-3242
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Nicolas Spiegelberg
>
> Currently, our memstore flush algorithm is pretty trivial.  We either let the 
> memstore grow to flushsize and flush that region, or let the log count grow 
> past a limit and then flush everything below a seqid.  In certain situations, 
> we can get big wins from
> being more intelligent with our memstore flush algorithm.  I suggest we look 
> into algorithms to intelligently handle HLog compactions.  By compaction, I 
> mean replacing existing HLogs with new HLogs created using the contents of a 
> memstore snapshot.  Situations where we can get huge wins:
> 1. In the incrementColumnValue case,  N HLog entries often correspond to a 
> single memstore entry.  Although we may have large HLog files, our memstore 
> could be relatively small.
> 2. If we have a hot region, the majority of the HLog consists of that one 
> region and other region edits would be minuscule.
> In both cases, we are forced to flush a bunch of very small stores.  It's 
> really hard for a compaction algorithm to be efficient when it has no 
> guarantees about the approximate size of a new StoreFile, so it currently does 
> unconditional, inefficient compactions.  Additionally, compactions & flushes 
> hurt because they invalidate cache entries, whether in the memstore or the 
> LRU block cache.  If 
> we can limit flushes to cases where we will have significant HFile output on 
> a per-Store basis, we can get improved performance, stability, and reduced 
> failover time.
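To make case 1 concrete: under incrementColumnValue, N WAL edits against the same cell can be collapsed into one edit carrying the summed delta, which is exactly the shape the memstore already holds.  A minimal sketch of that collapse, assuming edits keyed by a flat "region/row/family/qualifier" string; this is an illustration, not actual HBase code:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch (not actual HBase code): in the incrementColumnValue
// case, N WAL edits for the same cell collapse into a single edit carrying
// the summed delta.  A compacted HLog written this way replays in
// O(distinct cells) rather than O(increments).
class IcvLogCompactor {
  // key = "region/row/family/qualifier", value = increment delta
  static Map<String, Long> compact(List<Map.Entry<String, Long>> edits) {
    Map<String, Long> collapsed = new LinkedHashMap<>();
    for (Map.Entry<String, Long> e : edits) {
      collapsed.merge(e.getKey(), e.getValue(), Long::sum); // sum deltas per cell
    }
    return collapsed;
  }
}
```

With the log rewritten this way, a large HLog backing a small memstore shrinks to roughly the memstore's size, so the "flush everything below a seqid" trigger stops forcing tiny, inefficient StoreFiles.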

