[ https://issues.apache.org/jira/browse/CASSANDRA-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972149#comment-16972149 ]
Branimir Lambov commented on CASSANDRA-15368:
---------------------------------------------
Does this mean that this issue is only an artifact of the fix to
CASSANDRA-15367?
> Failing to flush Memtable without terminating process results in permanent
> data loss
> ------------------------------------------------------------------------------------
>
> Key: CASSANDRA-15368
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15368
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Commit Log, Local/Memtable
> Reporter: Benedict Elliott Smith
> Priority: Normal
> Fix For: 4.0, 2.2.x, 3.0.x, 3.11.x
>
>
> A {{Memtable}} does not contain records that cover a precise contiguous
> range of {{ReplayPosition}}, since there are only weak ordering constraints
> when rolling over to a new {{Memtable}}: the last operations for the old
> {{Memtable}} may obtain their {{ReplayPosition}} after the first operations
> for the new {{Memtable}}.
> Unfortunately, we treat the {{Memtable}} range as contiguous, and invalidate
> the entire range on flush. Ordinarily we only invalidate records when all
> prior {{Memtable}}s have also successfully flushed. However, in the event of
> a flush failure that does not terminate the process (either because of the
> disk failure policy, or because it is a software error), a later flush is
> able to invalidate the region of the commit log that includes records that
> should have been flushed in the prior {{Memtable}}.
> More problematically, this can also occur on restart without any associated
> flush failure, as we use commit log boundaries written to our flushed
> sstables to filter {{ReplayPosition}} on recovery, which is meant to
> replicate our {{Memtable}} flush behaviour above. However, we do not know
> that earlier flushes have completed, and they may complete successfully
> out-of-order. So any flush that completes before the process terminates, but
> began after another flush that _doesn’t_ complete before the process
> terminates, has the potential to cause permanent data loss.
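
The interleaving described above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not Cassandra code: the class FlushOrderSketch, the type ToyMemtable and the method lostPositions are hypothetical names, commit log positions are plain integers, and a flushed memtable is assumed to claim the whole interval between its lowest and highest position. Any position inside such an interval that belongs to a memtable which never flushed is skipped on replay, and is therefore lost.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class FlushOrderSketch {
    /** Toy stand-in for a memtable: the commit log positions of its writes and its flush outcome. */
    static final class ToyMemtable {
        final List<Integer> positions;
        final boolean flushed;

        ToyMemtable(List<Integer> positions, boolean flushed) {
            this.positions = positions;
            this.flushed = flushed;
        }

        // Assumption of this sketch: the claimed range is [min, max] of the memtable's own positions.
        int lowerBound() { return Collections.min(positions); }
        int upperBound() { return Collections.max(positions); }
    }

    /** Returns the positions that no sstable holds but that a flushed memtable's claimed range covers. */
    static List<Integer> lostPositions(List<ToyMemtable> memtables) {
        List<Integer> lost = new ArrayList<>();
        for (ToyMemtable unflushed : memtables) {
            if (unflushed.flushed) continue;
            for (int position : unflushed.positions) {
                for (ToyMemtable other : memtables) {
                    // The flushed memtable's range is treated as contiguous, so it also
                    // invalidates positions inside it that were written to another memtable.
                    if (other.flushed && position >= other.lowerBound() && position <= other.upperBound()) {
                        lost.add(position);
                        break;
                    }
                }
            }
        }
        return lost;
    }

    public static void main(String[] args) {
        // The old memtable obtained position 4 after the new memtable had already obtained position 3.
        ToyMemtable old = new ToyMemtable(List.of(1, 2, 4), false); // flush failed or never completed
        ToyMemtable next = new ToyMemtable(List.of(3, 5), true);    // flush completed
        System.out.println(lostPositions(List.of(old, next)));      // prints [4]
    }
}
```

In this model positions 1 and 2 remain replayable, because no flushed range covers them; only position 4, which was allocated after the switch to the new memtable, is invalidated by the later flush.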