[ https://issues.apache.org/jira/browse/HBASE-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16049240#comment-16049240 ]
Allan Yang commented on HBASE-18128:
------------------------------------
Had a chat with [~tedyu]. Ignoring the seqID of compaction markers will
result in all compaction markers in the log being replayed. Two options:
1. Only keep compaction markers with a seqid smaller than the flushed seqid.
This may drop the latest compaction marker from replay (a very rare case, and
not a problem: if it happens, the only result is some redundant files in the
region).
2. Keep all compaction markers in the hlogs, writing many useless compaction
marker entries to recovered.edits and replaying them (not a problem either,
since replaying a compaction is idempotent).
I'd prefer the first one.
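
To make the trade-off concrete, here is a minimal sketch of the skip check
with and without the seqID exemption for compaction markers. It is a toy
model, not the real log splitting code: the class MarkerSplitSketch, the Entry
type, and the ignoreSeqIdForMarkers flag are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical model of the skip check during log splitting.
// None of these names are real HBase APIs.
class MarkerSplitSketch {

    // A WAL entry reduced to its sequence id and a compaction-marker flag.
    static final class Entry {
        final long seqId;
        final boolean compactionMarker;

        Entry(long seqId, boolean compactionMarker) {
            this.seqId = seqId;
            this.compactionMarker = compactionMarker;
        }
    }

    // Returns the sequence ids of the entries that would survive the split.
    static List<Long> split(List<Entry> wal, long lastFlushedSeqId,
                            boolean ignoreSeqIdForMarkers) {
        List<Long> kept = new ArrayList<>();
        for (Entry e : wal) {
            boolean exempt = ignoreSeqIdForMarkers && e.compactionMarker;
            if (!exempt && lastFlushedSeqId >= e.seqId) {
                continue; // same comparison as the skip check in the issue
            }
            kept.add(e.seqId);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Entry> wal = Arrays.asList(
            new Entry(5, true),   // old compaction marker
            new Entry(9, false),  // ordinary edit, already flushed
            new Entry(10, true),  // latest compaction marker
            new Entry(12, false)  // edit written after the flush
        );
        long lastFlushedSeqId = 11; // flush happened after the latest marker

        // SeqID-based skipping: the latest marker (10) is dropped.
        System.out.println(split(wal, lastFlushedSeqId, false));
        // Ignoring the seqID for markers: every marker is kept, old ones too.
        System.out.println(split(wal, lastFlushedSeqId, true));
    }
}
```

In this model the first call drops the marker with seqid 10, while the second
keeps both markers, including the old one that no longer needs replaying.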
> compaction marker could be skipped
> -----------------------------------
>
> Key: HBASE-18128
> URL: https://issues.apache.org/jira/browse/HBASE-18128
> Project: HBase
> Issue Type: Improvement
> Components: Compaction, regionserver
> Reporter: Jingyun Tian
> Assignee: Jingyun Tian
> Attachments: HBASE-18128-master.patch, HBASE-18128-master-v2.patch,
> HBASE-18128-master-v3.patch, TestCompactionMarker.java
>
>
> The sequence for a compaction is as follows:
> 1. Compaction writes new files under region/.tmp directory (compaction output)
> 2. Compaction atomically moves the temporary file under region directory
> 3. Compaction appends a WAL edit containing the compaction input and output
> files. Forces sync on WAL.
> 4. Compaction deletes the input files from the region directory.
> But if a flush happens between steps 3 and 4 and the regionserver then
> crashes, the compaction marker will be skipped when splitting the log, because
> the sequence id of the compaction marker is smaller than lastFlushedSequenceId.
> {code}
> if (lastFlushedSequenceId >= entry.getKey().getLogSeqNum()) {
>   editsSkipped++;
>   continue;
> }
> {code}
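
The consequence of a skipped marker, as described in the quoted issue, can be
modeled with the region directory as a plain set of file names. This is only a
sketch under stated assumptions: CompactionReplaySketch and replayMarker are
hypothetical names, and "replay" is simplified to removing leftover input
files and making sure the output file is present.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model: a region directory is just a set of file names.
class CompactionReplaySketch {

    // Simplified replay of a compaction marker: remove any input files that
    // are still present and make sure the compaction output is listed.
    // Returns a new set; the argument is left untouched.
    static Set<String> replayMarker(Set<String> regionFiles,
                                    Set<String> inputs, String output) {
        Set<String> result = new TreeSet<>(regionFiles);
        result.removeAll(inputs);
        result.add(output);
        return result;
    }

    public static void main(String[] args) {
        Set<String> inputs = new TreeSet<>(Arrays.asList("a", "b"));

        // State after a crash between steps 3 and 4: the output "ab" is in
        // place, a flush added "d", and the inputs were never deleted.
        Set<String> afterCrash =
            new TreeSet<>(Arrays.asList("a", "b", "ab", "d"));

        // Marker skipped: "a" and "b" stay behind as redundant files.
        System.out.println(afterCrash);

        // Marker replayed: inputs are cleaned up.
        Set<String> once = replayMarker(afterCrash, inputs, "ab");
        System.out.println(once);

        // Replaying again changes nothing, which is why replaying extra
        // markers is harmless in this model.
        System.out.println(replayMarker(once, inputs, "ab").equals(once));
    }
}
```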