[
https://issues.apache.org/jira/browse/HBASE-17407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15799411#comment-15799411
]
stack commented on HBASE-17407:
-------------------------------
[~Apache9] has a point. The jiggering w/ sequenceid accounting just before
flush finishes adds extra machinery, compounding an already complex topic. It is
hard to reason about cleanly (I was reminded of this just trying to get back up
to the level at which this discussion is happening). Getting the accounting
wrong means data loss and a bunch of time lost debugging.
I see where you are coming from [~eshcar] with your leaning on existing
guarantees and trying to preserve the default code path as much as possible. Do
you think it possible to simplify the accounting of this new entity, the set of
edits that remain in the pipeline after a flush? What do you think of
[~Apache9]'s idea of removing the editing of sequenceids (and the discomforting
reset of the lowest sequenceid dependent on onlyIfGreater)? I like your attempt
at adjusting the lowest sequence id when an in-memory compaction ends up
dropping the minimum edits, but if dropping that adjustment means we can
simplify sequence id accounting, let's just let over-replay happen on crash.
I'd like to help. I've been trying to write a spec on sequenceid accounting for
a while now but I can't finish because it keeps changing (smile).
> Correct update of maxFlushedSeqId in HRegion
> --------------------------------------------
>
> Key: HBASE-17407
> URL: https://issues.apache.org/jira/browse/HBASE-17407
> Project: HBase
> Issue Type: Bug
> Reporter: Eshcar Hillel
>
> The attribute maxFlushedSeqId in HRegion is used to track the max sequence id
> in the store files and is reported to HMaster. When flushing only part of the
> memstore content this value might be incorrect and may cause data loss.
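The invariant the description is getting at can be shown with a minimal sketch
(hypothetical class and method names, not HBase's actual code, and simplified
to flushing a contiguous prefix of sequence ids): after a partial flush, the
safe maxFlushedSeqId to report is one less than the lowest sequence id still
unflushed, not the highest sequence id that happened to be flushed.

```java
import java.util.TreeSet;

// Hypothetical sketch of sequence id accounting under partial flush.
public class SeqIdAccounting {
    // Sequence ids of edits still in the memstore/pipeline (unflushed).
    private final TreeSet<Long> unflushed = new TreeSet<>();
    private long highestSeqId = 0;

    void append(long seqId) {
        unflushed.add(seqId);
        highestSeqId = Math.max(highestSeqId, seqId);
    }

    // Partial flush: persist edits with seqId <= flushUpTo; the rest stay in
    // the pipeline. (Simplification: assumes a contiguous prefix is flushed.)
    void partialFlush(long flushUpTo) {
        unflushed.headSet(flushUpTo, true).clear();
    }

    // Safe value to report to the master: every edit at or below this id is
    // durable in store files.
    long maxFlushedSeqId() {
        return unflushed.isEmpty() ? highestSeqId : unflushed.first() - 1;
    }

    public static void main(String[] args) {
        SeqIdAccounting acct = new SeqIdAccounting();
        for (long id = 1; id <= 10; id++) acct.append(id);
        acct.partialFlush(4); // edits 5..10 remain in the pipeline
        // Reporting 10 here would let the WAL be trimmed past edits 5..10,
        // losing them on a crash; 4 is the safe value.
        if (acct.maxFlushedSeqId() != 4) throw new AssertionError();
        acct.partialFlush(10); // everything flushed now
        if (acct.maxFlushedSeqId() != 10) throw new AssertionError();
        System.out.println("ok");
    }
}
```

The point of the sketch: once flushes no longer cover the whole memstore,
maxFlushedSeqId is a property of what is *left behind*, not of what was
flushed, which is the accounting this issue is about.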
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)