[jira] [Commented] (CASSANDRA-9669) If sstable flushes complete out of order, on restart we can fail to replay necessary commit log records

Benedict (JIRA) Tue, 15 Sep 2015 06:42:18 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-9669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14745464#comment-14745464
 ]


Benedict commented on CASSANDRA-9669:
-------------------------------------

bq. Hmm. This is a bit of a problem. 

I remember now why it is not. We don't generally permit downgrades, and when 
upgrading you must upgrade via the latest of any intervening minor version. 

Whether or not this is a problem, I think it's probably better left for another 
ticket, but I think we would be best off fixing it so that a given C* version 
knows the maximum sstable version it can read for each prior (and equal) minor 
Cassandra version. So long as the main data format is the same (which is the 
case here) there shouldn't be a problem, as we only stream the data contents 
between nodes, so there's no reason for an old version to see a new file. 
However, we should probably also make that more robust to sstable version 
changes.
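
As a rough illustration of the "maximum readable sstable version per minor 
release" idea, here is a minimal Python sketch. The table, the function name 
and the version strings are all hypothetical placeholders, not taken from the 
Cassandra source, and the sketch assumes that format versions within a release 
line can be compared lexicographically.

```python
# Hypothetical table: for each Cassandra minor release line, the newest
# sstable format version this build claims to be able to read.
# The version strings are placeholders, not taken from the Cassandra source.
MAX_READABLE = {
    "2.1": "kb",
    "2.2": "lb",
    "3.0": "ma",
}


def can_read(minor_release: str, sstable_version: str) -> bool:
    """Return True if this build can read an sstable of the given format
    version written by the given minor release line."""
    limit = MAX_READABLE.get(minor_release)
    if limit is None:
        # Unknown (for example, newer) release line: refuse to read.
        return False
    # Assumption: versions within a release line order lexicographically.
    return sstable_version <= limit
```

With a table like this, a node could reject a file from a newer format up 
front instead of failing partway through reading it.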

> If sstable flushes complete out of order, on restart we can fail to replay 
> necessary commit log records
> -------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-9669
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9669
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Critical
>              Labels: correctness
>             Fix For: 3.x, 2.1.x, 2.2.x, 3.0.x
>
>
> While {{postFlushExecutor}} ensures it never expires CL entries out-of-order, 
> on restart we simply take the maximum replay position of any sstable on disk, 
> and ignore anything prior. 
> It is quite possible for there to be two flushes triggered for a given table, 
> and for the second to finish first by virtue of containing a much smaller 
> quantity of live data (or perhaps the disk is just under less pressure). If 
> we crash before the first sstable has been written, then on restart the data 
> it would have represented will disappear, since we will not replay the CL 
> records.
> This looks to be a bug present since time immemorial, and also seems pretty 
> serious.
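
To make the failure in the quoted description concrete, here is a minimal 
Python sketch. It is a toy model, not Cassandra code: the `SSTable` class, the 
integer commit log positions and both function names are assumptions made for 
illustration. `naive_replay_start` mirrors the "take the maximum replay 
position" rule from the description; `contiguous_replay_start` is one possible 
safer rule, shown only for contrast and not necessarily what the ticket adopts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    # Hypothetical model: each flushed sstable records the commit log
    # interval (start, end] whose writes it contains.
    start: int
    end: int


def naive_replay_start(sstables):
    """Rule from the description: skip everything up to the maximum
    replay position of any sstable on disk."""
    return max((s.end for s in sstables), default=0)


def contiguous_replay_start(sstables):
    """Illustrative alternative: only skip a gap-free prefix of the log."""
    covered = 0
    for s in sorted(sstables, key=lambda s: s.start):
        if s.start > covered:
            break  # gap: an earlier flush never reached disk
        covered = max(covered, s.end)
    return covered


def records_replayed(log_positions, replay_start):
    """Commit log records that would be replayed on restart."""
    return [p for p in log_positions if p > replay_start]


# Two flushes of one table: A covers (0, 100], B covers (100, 120].
# B finishes first and the node crashes before A is written.
on_disk = [SSTable(start=100, end=120)]
log = list(range(1, 121))

naive = records_replayed(log, naive_replay_start(on_disk))
safer = records_replayed(log, contiguous_replay_start(on_disk))
```

In this model `naive` is empty, so records 1 to 100 are never replayed even 
though no sstable contains them, while `safer` replays the whole log at the 
cost of re-applying records 101 to 120, which sstable B already holds.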



