[jira] [Commented] (CASSANDRA-9669) If sstable flushes complete out of order, on restart we can fail to replay necessary commit log records

Branimir Lambov (JIRA) Fri, 22 Apr 2016 06:46:56 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-9669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15253951#comment-15253951
 ]


Branimir Lambov commented on CASSANDRA-9669:
--------------------------------------------

Rebased patches with a couple of extra tests:

|[2.2|https://github.com/blambov/cassandra/tree/belliottsmith-9669-2.2-rebased-2]|[utest|http://cassci.datastax.com/job/blambov-belliottsmith-9669-2.2-rebased-2-testall/]|[dtest|http://cassci.datastax.com/job/blambov-belliottsmith-9669-2.2-rebased-2-dtest/]|
|[3.0|https://github.com/blambov/cassandra/tree/belliottsmith-9669-3.0-rebased-2]|[utest|http://cassci.datastax.com/job/blambov-belliottsmith-9669-3.0-rebased-2-testall/]|[dtest|http://cassci.datastax.com/job/blambov-belliottsmith-9669-3.0-rebased-2-dtest/]|

The code looks good to me in both branches; tests are still running.

> If sstable flushes complete out of order, on restart we can fail to replay 
> necessary commit log records
> -------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-9669
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9669
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths
>            Reporter: Benedict
>            Priority: Critical
>              Labels: correctness
>             Fix For: 2.2.x, 3.0.x, 3.x
>
>
> While {{postFlushExecutor}} ensures it never expires CL entries out-of-order, 
> on restart we simply take the maximum replay position of any sstable on disk, 
> and ignore anything prior. 
> It is quite possible for there to be two flushes triggered for a given table, 
> and for the second to finish first by virtue of containing a much smaller 
> quantity of live data (or perhaps the disk is just under less pressure). If 
> we crash before the first sstable has been written, then on restart the data 
> it would have represented will disappear, since we will not replay the CL 
> records.
> This looks to be a bug present since time immemorial, and also seems pretty 
> serious.
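
The failure mode described in the quoted issue can be illustrated with a small, self-contained sketch. It is not Cassandra code: the class and method names (ReplaySketch, LogRecord, SSTable, replayFromMax, lostCount) are hypothetical, and the model assumes each commit log record belongs to exactly one flush. It only shows why skipping everything at or before the maximum replay position of the sstables on disk drops the records of a flush that never finished.

```java
import java.util.ArrayList;
import java.util.List;

public class ReplaySketch {
    /** Hypothetical commit log record: its log position and the flush that owns it. */
    static final class LogRecord {
        final long position;
        final int flushId;
        LogRecord(long position, int flushId) { this.position = position; this.flushId = flushId; }
    }

    /** Hypothetical sstable on disk: the flush that wrote it and the highest log position it covers. */
    static final class SSTable {
        final int flushId;
        final long replayPosition;
        SSTable(int flushId, long replayPosition) { this.flushId = flushId; this.replayPosition = replayPosition; }
    }

    /** Restart rule from the issue: replay only records after the max replay position of any sstable. */
    static List<LogRecord> replayFromMax(List<LogRecord> log, List<SSTable> onDisk) {
        long max = -1L;
        for (SSTable s : onDisk) max = Math.max(max, s.replayPosition);
        List<LogRecord> replayed = new ArrayList<>();
        for (LogRecord r : log)
            if (r.position > max) replayed.add(r);
        return replayed;
    }

    /** Counts records that are neither in an sstable on disk nor replayed, i.e. lost. */
    static int lostCount(List<LogRecord> log, List<SSTable> onDisk) {
        List<LogRecord> replayed = replayFromMax(log, onDisk);
        int lost = 0;
        for (LogRecord r : log) {
            boolean persisted = false;
            for (SSTable s : onDisk)
                if (s.flushId == r.flushId) persisted = true;
            if (!persisted && !replayed.contains(r)) lost++;
        }
        return lost;
    }

    /** Flush 1 owns positions 1..3, flush 2 owns positions 4..5. */
    static List<LogRecord> sampleLog() {
        List<LogRecord> log = new ArrayList<>();
        for (long p = 1; p <= 3; p++) log.add(new LogRecord(p, 1));
        for (long p = 4; p <= 5; p++) log.add(new LogRecord(p, 2));
        return log;
    }

    /** Flush 2 finished first and the node crashed before flush 1 wrote its sstable. */
    static int outOfOrderCrashLoss() {
        List<SSTable> onDisk = new ArrayList<>();
        onDisk.add(new SSTable(2, 5));
        return lostCount(sampleLog(), onDisk);
    }

    /** Both flushes completed before the restart. */
    static int bothFlushedLoss() {
        List<SSTable> onDisk = new ArrayList<>();
        onDisk.add(new SSTable(1, 3));
        onDisk.add(new SSTable(2, 5));
        return lostCount(sampleLog(), onDisk);
    }

    public static void main(String[] args) {
        System.out.println("lost after out-of-order crash: " + outOfOrderCrashLoss()); // prints 3
        System.out.println("lost when both flushed: " + bothFlushedLoss());            // prints 0
    }
}
```

In the out-of-order case the only sstable on disk reports replay position 5, so nothing is replayed and the three records owned by the unfinished flush are gone.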



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
