[ https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191091#comment-14191091 ]
zhangduo commented on HBASE-10201:
----------------------------------

{quote}
Sequenceids are region scoped. If we flush by Store, will there be holes in our accounting? I write sequenceid 1 to A, sequenceid 2 to B, and sequenceid 3 to C. I then write sequence 4 to A. The edit at sequenceid 4 is big and pushes us over and brings on a flush. We flush A and edits 1 and 4. Is the fact that edits 2 and 3 are still up in memory going to mess us up.... Say the server crashes, at replay time we see we flushed up to edit 4, will we think that we edits 2 and 3 persisted? If you don't have an answer, I can work on the answer.
{quote}
Yes. In this case we write flush seqId 1. (I made a mistake here: the patch currently writes seqId 2, because "flushSeqId = oldestSeqIdInStoresNotToFlush" should be "flushSeqId = oldestSeqIdInStoresNotToFlush - 1". I will fix it.) So there will be holes, and some of the WAL replay done during recovery is unnecessary. To solve this we need to store a map of seqId per store instead of a single seqId, and log truncation and log replay will also need some work.

{quote}
Has this patch been deployed somewhere in production (smile?). If so, would be good to know. In production, it helps?
{quote}
For me, no. I am using 0.98.6.1 with HBASE-12078 patched right now (which is why I first tried to port it to 0.98 in this issue...). Some test results are posted above.
And in our production I always see logs like this:
{quote}
2014-09-29 13:16:25,061 INFO [MemStoreFlusher.0] regionserver.HRegion: Started memstore flush for sync:Snapshot,\x00\x00\x00\x00\x02$\x0CC,1411782012686.50aba6be7ff3150be983cb6fd77fc686., current region memstore size 128.3 M
2014-09-29 13:16:25,121 INFO [MemStoreFlusher.0] regionserver.DefaultStoreFlusher: Flushed, sequenceid=10932563, memsize=265.7 K, hasBloomFilter=true, into tmp file hdfs://online-hbase/hbase/data/sync/Snapshot/50aba6be7ff3150be983cb6fd77fc686/.tmp/129e5ef69d7449fea9c2357aa6c4340a
2014-09-29 13:16:25,192 INFO [MemStoreFlusher.0] regionserver.DefaultStoreFlusher: Flushed, sequenceid=10932563, memsize=2.2 M, hasBloomFilter=true, into tmp file hdfs://online-hbase/hbase/data/sync/Snapshot/50aba6be7ff3150be983cb6fd77fc686/.tmp/316fee39423142e09cdb767de9f9bc5d
2014-09-29 13:16:25,528 INFO [MemStoreFlusher.0] regionserver.DefaultStoreFlusher: Flushed, sequenceid=10932563, memsize=27.9 M, hasBloomFilter=true, into tmp file hdfs://online-hbase/hbase/data/sync/Snapshot/50aba6be7ff3150be983cb6fd77fc686/.tmp/a886c1e39565468fbf93be6c434f5fc5
2014-09-29 13:16:26,190 INFO [MemStoreFlusher.0] regionserver.DefaultStoreFlusher: Flushed, sequenceid=10932563, memsize=98.0 M, hasBloomFilter=true, into tmp file hdfs://online-hbase/hbase/data/sync/Snapshot/50aba6be7ff3150be983cb6fd77fc686/.tmp/ec722497c6e14d0fa732c2a9d29e3391
{quote}
The smallest store is always flushed with only KBs of data. That is how I found this issue and why I started working on it...

{quote}
Can you do this for the accounting fixup so by-Store in HLog.
{quote}
Yes, I can open another issue to work on this. Thanks.
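
For illustration, here is a minimal sketch of the corrected rule mentioned above ("flushSeqId = oldestSeqIdInStoresNotToFlush - 1"), applied to the A/B/C example from the quoted question. The class name, method name, and the map of oldest unflushed seqIds per store are hypothetical and are not taken from the patch; this only shows the arithmetic, not how HRegion actually tracks sequence ids.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of HBase or of the attached patches.
final class FlushSeqIdSketch {

    /**
     * Returns the sequence id that can safely be recorded for a flush that
     * only covers some stores of a region.
     *
     * oldestUnflushedSeqIdByStore: assumed bookkeeping, the oldest edit still
     * in the memstore of each store (stores with an empty memstore are absent).
     */
    static long flushSeqId(Map<String, Long> oldestUnflushedSeqIdByStore,
                           Set<String> storesToFlush,
                           long currentRegionSeqId) {
        long oldestInStoresNotToFlush = Long.MAX_VALUE;
        for (Map.Entry<String, Long> e : oldestUnflushedSeqIdByStore.entrySet()) {
            if (!storesToFlush.contains(e.getKey())) {
                oldestInStoresNotToFlush =
                    Math.min(oldestInStoresNotToFlush, e.getValue());
            }
        }
        // No store is left with unflushed edits: everything up to the
        // current region sequence id is persisted.
        if (oldestInStoresNotToFlush == Long.MAX_VALUE) {
            return currentRegionSeqId;
        }
        // Every edit strictly older than the oldest edit still in memory
        // is persisted, hence the "- 1".
        return oldestInStoresNotToFlush - 1;
    }

    public static void main(String[] args) {
        // Edits 1 and 4 went to A, edit 2 to B, edit 3 to C; only A is flushed.
        Map<String, Long> oldest = new HashMap<>();
        oldest.put("A", 1L);
        oldest.put("B", 2L);
        oldest.put("C", 3L);
        long id = flushSeqId(oldest, Collections.singleton("A"), 4L);
        System.out.println("flushSeqId = " + id); // prints "flushSeqId = 1"
    }
}
```

With a single seqId per region, edit 4 is replayed after a crash even though it was already flushed with A; that is the unnecessary replay described above, which a per-store map of seqIds would avoid.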
> Port 'Make flush decisions per column family' to trunk
> ------------------------------------------------------
>
>                 Key: HBASE-10201
>                 URL: https://issues.apache.org/jira/browse/HBASE-10201
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>            Reporter: Ted Yu
>            Assignee: zhangduo
>            Priority: Critical
>             Fix For: 2.0.0, 0.99.2
>
>         Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch,
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch,
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_2.patch,
> HBASE-10201_3.patch
>
>
> Currently the flush decision is made using the aggregate size of all column
> families. When large and small column families co-exist, this causes many
> small flushes of the smaller CF. We need to make per-CF flush decisions.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)