[
https://issues.apache.org/jira/browse/CASSANDRA-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14701021#comment-14701021
]
Branimir Lambov commented on CASSANDRA-9749:
--------------------------------------------
Let me try to rephrase and simplify Jonathan's question:
The original idea of this ticket was to let the commit log failure policy
dictate what is done in the case of replay failures. Now that CASSANDRA-8515 is
in place, this means that regardless of the failure policy the node will always
refuse to start in the case of replay failures (unless they are in the last
section of the last segment). This can be overridden using the
{{cassandra.commitlog.ignorereplayerrors}} flag.
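As a rough illustration of the behaviour described above, here is a minimal sketch. It is not the actual CommitLogReplayer code: only the property name comes from this comment, the class and method names are hypothetical, and it assumes the flag is read as a JVM system property.

```java
// Hypothetical sketch of the startup decision described above; not Cassandra source.
class ReplayErrorSketch {
    // Property name as quoted in the comment; reading it as a JVM system
    // property (-D...=true) is an assumption of this sketch.
    static final String IGNORE_REPLAY_ERRORS = "cassandra.commitlog.ignorereplayerrors";

    /**
     * Returns true if the node should refuse to start after a replay error.
     * Note that the configured commit log failure policy is deliberately
     * absent from the inputs: per the comment, it does not affect the outcome.
     */
    static boolean shouldAbortStartup(boolean inLastSectionOfLastSegment,
                                      boolean ignoreReplayErrors) {
        if (inLastSectionOfLastSegment) {
            // Errors at the very tail of the log are tolerated.
            return false;
        }
        // Anywhere else, startup is refused unless the override flag is set.
        return !ignoreReplayErrors;
    }

    public static void main(String[] args) {
        boolean ignore = Boolean.getBoolean(IGNORE_REPLAY_ERRORS);
        System.out.println("abort on mid-log error: " + shouldAbortStartup(false, ignore));
        System.out.println("abort on tail error:    " + shouldAbortStartup(true, ignore));
    }
}
```

Running this with {{-Dcassandra.commitlog.ignorereplayerrors=true}} flips the first result, which is the only way to change the outcome in this model.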
The question now is whether this is a good or sufficient solution, or whether I
should invest time in bypassing the CASSANDRA-8515 behaviour so that the
original failure policy can apply.
My personal opinion is that it _is_ a proper solution. Any scenario I could
imagine benefiting from e.g. the 'stop' policy at log replay would also be
better served by the same policy at log startup.
> CommitLogReplayer continues startup after encountering errors
> -------------------------------------------------------------
>
> Key: CASSANDRA-9749
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9749
> Project: Cassandra
> Issue Type: Bug
> Reporter: Blake Eggleston
> Assignee: Branimir Lambov
> Fix For: 2.2.x
>
> Attachments: 9749-coverage.tgz
>
>
> There are a few places where the commit log recovery method either skips
> sections or just returns when it encounters errors.
> Specifically if it can't read the header here:
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L298
> Or if there are compressor problems here:
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L314
> and here:
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L366
> Whether these are user-fixable or not, I think we should require more direct
> user intervention (i.e., fix what's wrong, or remove the bad file and restart)
> since we're basically losing data.