[
https://issues.apache.org/jira/browse/HBASE-17407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801513#comment-15801513
]
stack commented on HBASE-17407:
-------------------------------
bq. (but for some reason the comment was deleted).
I intentionally deleted the comment because I felt it added little benefit to
the back-and-forth here.
bq. I think it was important to understand that in the current state there is
no danger of data loss.
You mean with the finalizeFlush/updateStore calls in place and NO inmemory
compaction -- just BASIC mode where we flush all in the pipeline?
If the above, I think so. That said, the finalizeFlush/updateStore calls are
new moving pieces and these corner cases are hard to manufacture.
bq. Code maintainability is also important.
Yes. This sequenceid accounting is unfortunately involved and tough to test.
bq. I can replace finalizeFlush with a preFlushSeqIDEstimation() which returns
a lower bound on the sequence id that is invoked before we start the flush.
You think this will restore our sequence id accounting to what it was before
finalizeFlush/updateStore? How will we deal with the gap between the new
edits coming in and filling lowestUnflushedSequenceIds, after we have swapped
it out to do the current flush, and the edits in the pipeline that did not get
flushed during the current flush session?
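To make the gap concrete, here is a minimal sketch of the swap for a single region. It is not the HBase code: the class FlushSwapSketch, its method names, the String family keys and the NO_SEQNUM sentinel are all made up for illustration, and it assumes the accounting only records the first sequence id it sees per family after the swap.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical single-region model of the accounting swap; not the HBase class. */
final class FlushSwapSketch {
  static final long NO_SEQNUM = -1L; // assumed sentinel meaning "nothing tracked"

  // family -> lowest sequence id appended since the last swap
  private final Map<String, Long> lowestUnflushed = new HashMap<>();
  // family -> lowest sequence id covered by the flush in progress
  private final Map<String, Long> flushing = new HashMap<>();

  /** Records an edit; only the first sequence id seen per family is kept. */
  void append(String family, long seqId) {
    lowestUnflushed.putIfAbsent(family, seqId);
  }

  /** Start of flush: the unflushed entry is swapped out into the flushing map. */
  void startFlush(String family) {
    Long lowest = lowestUnflushed.remove(family);
    if (lowest != null) {
      flushing.put(family, lowest);
    }
  }

  /** End of flush: the flushing entry is dropped, whatever was actually persisted. */
  void completeFlush(String family) {
    flushing.remove(family);
  }

  /** Flushing entry wins; otherwise fall back to the unflushed entry. */
  long lowest(String family) {
    Long v = flushing.get(family);
    if (v == null) {
      v = lowestUnflushed.get(family);
    }
    return v == null ? NO_SEQNUM : v;
  }
}
```

Under these assumptions, append 5, 6 and 7, start a flush, and append 8 while it runs. If the flush only persists part of the pipeline (say edit 5), then after completeFlush the model reports 8 as the lowest unflushed id even though 6 and 7 are still in memory. That is the gap I am asking about.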
bq. You say WAL truncation cannot be triggered during a flush.
Indeed. See how closeBarrier is used in AbstractFSWAL.
bq. Can the map in seq accounting be reported to master during a flush?
See HRegion#setCompleteSequenceId where we build our sequenceid to send to the
master. See how it asks the WAL subsystem for earliest edit by column family:
long earliest = this.wal.getEarliestMemstoreSeqNum(encodedRegionName, familyName);
Here is the implementation:
{code}
@Override
public long getEarliestMemstoreSeqNum(byte[] encodedRegionName, byte[] familyName) {
  // This method is used by tests and for figuring if we should flush or not because our
  // sequenceids are too old. It is also used reporting the master our oldest sequenceid for use
  // figuring what edits can be skipped during log recovery. getEarliestMemStoreSequenceId
  // from this.sequenceIdAccounting is looking first in flushingOldestStoreSequenceIds, the
  // currently flushing sequence ids, and if anything found there, it is returning these. This is
  // the right thing to do for the reporting oldest sequenceids to master; we won't skip edits if
  // we crash during the flush. For figuring what to flush, we might get requeued if our sequence
  // id is old even though we are currently flushing. This may mean we do too much flushing.
  return this.sequenceIdAccounting.getLowestSequenceId(encodedRegionName, familyName);
}
{code}
The comment tries to explain how it works.
That it returns flushingSequenceIds and then lowestUnflushedSequenceIds if the
former is not present may be what [~Apache9] is referring to with the 'not
report the value if a flush ongoing' (I did not see a block on reporting during
'flush' -- maybe I'm looking in the wrong place).
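The lookup order described above can be sketched roughly as follows. This is a simplified stand-in, not the actual accounting class: SeqIdLookupSketch, the setter names, the String keys (the real code keys on byte[]) and the NO_SEQNUM sentinel are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for the sequence id accounting lookup. */
final class SeqIdLookupSketch {
  static final long NO_SEQNUM = -1L; // assumed sentinel meaning "nothing known"

  // region -> (family -> lowest sequence id in the flush currently in progress)
  private final Map<String, Map<String, Long>> flushing = new HashMap<>();
  // region -> (family -> lowest sequence id still unflushed in the memstore)
  private final Map<String, Map<String, Long>> lowestUnflushed = new HashMap<>();

  void setFlushing(String region, String family, long seqId) {
    flushing.computeIfAbsent(region, r -> new HashMap<>()).put(family, seqId);
  }

  void setLowestUnflushed(String region, String family, long seqId) {
    lowestUnflushed.computeIfAbsent(region, r -> new HashMap<>()).put(family, seqId);
  }

  /** Looks in the flushing map first and only then in the unflushed map. */
  long getLowestSequenceId(String region, String family) {
    Long fromFlushing = lookup(flushing, region, family);
    if (fromFlushing != null) {
      return fromFlushing; // a flush is in progress: report the older, safe value
    }
    Long fromUnflushed = lookup(lowestUnflushed, region, family);
    return fromUnflushed == null ? NO_SEQNUM : fromUnflushed;
  }

  private static Long lookup(Map<String, Map<String, Long>> map, String region, String family) {
    Map<String, Long> families = map.get(region);
    return families == null ? null : families.get(family);
  }
}
```

With an unflushed entry of 20 and a flushing entry of 10 for the same family, the sketch returns 10. So while a flush is ongoing the value is still reported, only it is the flushing one, which is the conservative choice if we crash mid-flush.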
Thanks.
> Correct update of maxFlushedSeqId in HRegion
> --------------------------------------------
>
> Key: HBASE-17407
> URL: https://issues.apache.org/jira/browse/HBASE-17407
> Project: HBase
> Issue Type: Bug
> Reporter: Eshcar Hillel
>
> The attribute maxFlushedSeqId in HRegion is used to track the max sequence id
> in the store files and is reported to HMaster. When flushing only part of the
> memstore content this value might be incorrect and may cause data loss.