[
https://issues.apache.org/jira/browse/CASSANDRA-13282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jeff Jirsa updated CASSANDRA-13282:
-----------------------------------
Description: Following CASSANDRA-9749, stricter correctness checks on
commitlog replay can incorrectly detect "corrupt segments" and stop commitlog
replay (and potentially stop Cassandra, depending on the configured policy). In
{{CommitlogReplayer#replaySyncSection}} we try to read a 4-byte int
{{serializedSize}}, and if it's 0 (which will happen due to zeroing when the
segment was created), we continue on to the next segment. However, it appears
that if a mutation is sized such that it ends with 1, 2, or 3 bytes remaining
in the segment, we'll pass the {{isEOF}} check on the while loop but fail to
read the {{serializedSize}} int, and replay fails.
(was: Following CASSANDRA-9749 , stricter
correctness checks on commitlog replay can incorrectly detect "corrupt
segments" and stop commitlog replay (and potentially stop cassandra, depending
on the configured policy). In {{CommitlogReplayer#replaySyncSection}} we try to
read a 4 byte int {{serializedSize}}, and if it's 0 (which will happen due to
zeroing when the segment was created), we continue on to the next segment.
However, it appears that if a mutation is sized such that it ends with 1, 2, or
3 bytes remaining in the segment, we'll hit pass the {{isEOF}} on the while
loop but fail to read the {{serializedSize}} int, and fail. )
> Commitlog replay may fail if last mutation is within 4 bytes of end of segment
> ------------------------------------------------------------------------------
>
> Key: CASSANDRA-13282
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13282
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Jeff Jirsa
> Assignee: Jeff Jirsa
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> Following CASSANDRA-9749, stricter correctness checks on commitlog replay
> can incorrectly detect "corrupt segments" and stop commitlog replay (and
> potentially stop Cassandra, depending on the configured policy). In
> {{CommitlogReplayer#replaySyncSection}} we try to read a 4-byte int
> {{serializedSize}}, and if it's 0 (which will happen due to zeroing when the
> segment was created), we continue on to the next segment. However, it appears
> that if a mutation is sized such that it ends with 1, 2, or 3 bytes remaining
> in the segment, we'll pass the {{isEOF}} check on the while loop but fail to
> read the {{serializedSize}} int, and replay fails.
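
To make the failure mode above concrete, here is a minimal, self-contained sketch using a plain java.nio.ByteBuffer. It is not the Cassandra code and not the committed patch: the class ReplayTailSketch, its methods, and the simplified record layout (a 4-byte size followed by that many payload bytes, with no checksums) are all assumptions made for illustration. The "strict" loop only checks that some bytes remain before reading the size, so a 1-3 byte tail makes the read fail; the "tolerant" loop treats a tail shorter than the size field as end of section.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; names and layout are not from Cassandra.
class ReplayTailSketch {
    static final int SIZE_FIELD_BYTES = 4;

    // Loops while any byte remains, then reads a 4-byte size.
    // With 1-3 bytes left, getInt() throws BufferUnderflowException.
    static int countStrict(ByteBuffer section) {
        int count = 0;
        while (section.hasRemaining()) {
            int serializedSize = section.getInt();
            if (serializedSize == 0) {
                break; // zeroed space: nothing more was written here
            }
            section.position(section.position() + serializedSize); // skip payload
            count++;
        }
        return count;
    }

    // Same loop, but a tail too short to hold a size field ends the section.
    static int countTolerant(ByteBuffer section) {
        int count = 0;
        while (section.remaining() >= SIZE_FIELD_BYTES) {
            int serializedSize = section.getInt();
            if (serializedSize == 0) {
                break;
            }
            section.position(section.position() + serializedSize);
            count++;
        }
        return count;
    }

    // Builds a section holding one record followed by tailBytes zero bytes.
    static ByteBuffer sectionWithTail(int payloadBytes, int tailBytes) {
        ByteBuffer buf = ByteBuffer.allocate(SIZE_FIELD_BYTES + payloadBytes + tailBytes);
        buf.putInt(payloadBytes);
        buf.position(buf.position() + payloadBytes); // payload left as zeros
        buf.rewind(); // back to the start; limit stays at capacity so the tail is kept
        return buf;
    }
}
```

With sectionWithTail(6, 2), countStrict throws while countTolerant counts the single record; with a tail of 0 or 4 bytes both loops agree. Whether the real fix should skip such a tail or handle it some other way is left open here.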