[
https://issues.apache.org/jira/browse/HBASE-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16036641#comment-16036641
]
Jingyun Tian commented on HBASE-18128:
--------------------------------------
We replay the compaction marker only to delete the files that should have been
deleted after the compaction.
If the compaction input files were already deleted, replaying the
compaction marker does nothing.
But if the compaction input files were not deleted, we need to replay the
compaction marker to delete them.
So this operation is idempotent, which is why I think we don't need to consider
the seqId.
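To illustrate the idempotence argument, here is a minimal Java sketch. It is not HBase code: the class CompactionMarkerReplaySketch, the CompactionMarker type and the replay method are hypothetical names, and the region's store directory is modeled as a plain in-memory set of file names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CompactionMarkerReplaySketch {

    /** Hypothetical stand-in for a compaction marker: input and output file names. */
    public static final class CompactionMarker {
        public final List<String> inputFiles;
        public final List<String> outputFiles;

        public CompactionMarker(List<String> inputFiles, List<String> outputFiles) {
            this.inputFiles = inputFiles;
            this.outputFiles = outputFiles;
        }
    }

    /**
     * Replays a marker against the (modeled) store directory.
     * Deletes every compaction input that is still present and
     * returns how many files were actually removed.
     */
    public static int replay(Set<String> storeFiles, CompactionMarker marker) {
        int removed = 0;
        for (String input : marker.inputFiles) {
            // Set.remove returns false when the file is already gone,
            // so a repeated replay leaves the state unchanged.
            if (storeFiles.remove(input)) {
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        // Crash happened after the marker was written but before the inputs were deleted.
        Set<String> storeFiles = new HashSet<>(Arrays.asList("in1", "in2", "out1"));
        CompactionMarker marker =
            new CompactionMarker(Arrays.asList("in1", "in2"), Arrays.asList("out1"));

        System.out.println("first replay removed:  " + replay(storeFiles, marker));
        System.out.println("second replay removed: " + replay(storeFiles, marker));
        System.out.println("remaining files:       " + storeFiles);
    }
}
```

In this model the result depends only on which input files still exist, not on the order of the marker relative to other edits, which is the reason given above for ignoring the seqId.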
As for your question, sorry, I don't understand what you mean by "clubbed
together with normal cell(s)".
The cells from the same region are stored in the same recovered edits file, so
I think they are already clubbed together.
The difference is in how we replay them when we go through the file.
> compaction marker could be skipped
> -----------------------------------
>
> Key: HBASE-18128
> URL: https://issues.apache.org/jira/browse/HBASE-18128
> Project: HBase
> Issue Type: Improvement
> Components: Compaction, regionserver
> Reporter: Jingyun Tian
> Assignee: Jingyun Tian
> Attachments: HBASE-18128.patch
>
>
> The sequence for a compaction is as follows:
> 1. Compaction writes new files under region/.tmp directory (compaction output)
> 2. Compaction atomically moves the temporary file under region directory
> 3. Compaction appends a WAL edit containing the compaction input and output
> files. Forces sync on WAL.
> 4. Compaction deletes the input files from the region directory.
> But if a flush happens between steps 3 and 4 and the regionserver then
> crashes, the compaction marker will be skipped when splitting the log, because
> the sequence id of the compaction marker is smaller than lastFlushedSequenceId.
> {code}
> if (lastFlushedSequenceId >= entry.getKey().getLogSeqNum()) {
>   editsSkipped++;
>   continue;
> }
> {code}