[jira] [Commented] (CASSANDRA-9749) CommitLogReplayer continues startup after encountering errors

Branimir Lambov (JIRA) Thu, 09 Jul 2015 06:39:25 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620509#comment-14620509
 ]


Branimir Lambov commented on CASSANDRA-9749:
--------------------------------------------

There are several more things to consider here:
- If a node is powered off while flushing a section to disk, replay will hit a 
read or decompression error on that section. This is a normal, frequently 
occurring situation. Do we stop/die for it?
- Supposing there's bit rot and the operator does want to recover the data in 
the other log sections of the same segment, they now have to change the commit 
log policy to "ignore" and boot up the cluster with that setting. Do we want to 
run the (quite substantial) risk that the operator never restarts it with the 
original setting and the node stays in an unintended "ignore" commit log policy?
- As mentioned in CASSANDRA-7125, how do we know a table is unknown rather than 
dropped?
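
To make the first two points concrete, here is a minimal sketch of one possible way to separate the "power loss while flushing" case from genuine corruption. It is not what CommitLogReplayer does and not a proposal agreed on in this ticket. The class, both enums, and the method `onSectionError` are hypothetical names, and the rule itself (tolerate an unreadable section only when it is the last section of the newest segment) is an assumption made for illustration.

```java
// Hypothetical sketch; none of these names exist in Cassandra.
final class ReplayErrorSketch {
    // Loosely modelled on the stop/die/ignore choices discussed above (assumption).
    enum Policy { STOP, DIE, IGNORE }

    enum Action { CONTINUE, HALT }

    /**
     * Decide what to do when a section cannot be read or decompressed.
     *
     * @param newestSegment true if the segment is the most recently written one
     * @param sectionIndex  zero-based index of the failing section in its segment
     * @param sectionCount  number of sections found in the segment
     * @param policy        the operator's configured policy
     */
    static Action onSectionError(boolean newestSegment, int sectionIndex,
                                 int sectionCount, Policy policy) {
        boolean lastSection = sectionIndex == sectionCount - 1;
        // A failure at the very tail of the newest segment is what an
        // interrupted flush would look like, so this sketch tolerates it.
        if (newestSegment && lastSection) {
            return Action.CONTINUE;
        }
        // Anything else is treated as real corruption: only an explicit
        // "ignore" policy lets replay continue past it.
        return policy == Policy.IGNORE ? Action.CONTINUE : Action.HALT;
    }

    public static void main(String[] args) {
        // Tail of the newest segment: tolerated even under STOP.
        System.out.println(onSectionError(true, 2, 3, Policy.STOP));
        // Middle of an older segment: halts unless the policy is IGNORE.
        System.out.println(onSectionError(false, 1, 3, Policy.STOP));
        System.out.println(onSectionError(false, 1, 3, Policy.IGNORE));
    }
}
```

Under this rule the common interrupted-flush case would not require switching the whole node to "ignore", which is the risk raised in the second point. It does not address the third point about unknown versus dropped tables.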


> CommitLogReplayer continues startup after encountering errors
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-9749
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9749
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Blake Eggleston
>            Assignee: Branimir Lambov
>             Fix For: 2.2.0 rc2
>
>
> There are a few places where the commit log recovery method either skips 
> sections or just returns when it encounters errors.
> Specifically if it can't read the header here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L298
> Or if there are compressor problems here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L314
>  and here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L366
> Whether these are user-fixable or not, I think we should require more direct 
> user intervention (i.e., fix what's wrong, or remove the bad file and restart) 
> since we're basically losing data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
