Zhao Yang updated CASSANDRA-19786:
----------------------------------
Summary: Scrubber can't fix corruption on a corrupted compressed sstable and
generates a corrupted sstable (was: Scrubber can't fix corruption on corrupted
compressed sstable)
> Scrubber can't fix corruption on a corrupted compressed sstable and generates
> a corrupted sstable
> --------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-19786
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19786
> Project: Cassandra
> Issue Type: Bug
> Components: Local/SSTable
> Reporter: Zhao Yang
> Priority: Normal
>
> When the scrubber detects corruption,
> `CompressedSequentialWriter#resetAndTruncate` is called, which has two
> problems (see the sketch after this list):
> * `uncompressedSize` and `compressedSize` are not reset. When the next chunk
> is flushed, the last flush offset is computed from the stale sizes. We should
> get rid of these variables.
> * The CRC value is computed over flushed chunks, including chunks that may
> have since been reset, so the final digest value will be wrong. We have to
> rescan the data file at the end of the writer to compute the proper digest if
> resetAndTruncate is called.
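>
> Below is a minimal, hypothetical sketch of the two failure modes. The class,
> field, and method names mirror `CompressedSequentialWriter` but are
> illustrative only, not the real implementation:
> ```java
> import java.io.File;
> import java.io.IOException;
> import java.io.RandomAccessFile;
> import java.util.zip.CRC32;
>
> // Toy model of a chunked compressed writer; names are illustrative.
> final class ToyChunkedWriter {
>     private final RandomAccessFile out;
>     private final CRC32 crc = new CRC32(); // streaming digest over flushed chunks
>     private long compressedSize = 0;       // running totals the real writer keeps
>     private long uncompressedSize = 0;
>
>     ToyChunkedWriter(File f) throws IOException {
>         out = new RandomAccessFile(f, "rw");
>     }
>
>     void flushChunk(byte[] compressed, int uncompressedLength) throws IOException {
>         out.write(compressed);
>         crc.update(compressed);            // the digest has seen this chunk for good
>         compressedSize += compressed.length;
>         uncompressedSize += uncompressedLength;
>     }
>
>     void resetAndTruncate(long compressedOffset, long uncompressedOffset) throws IOException {
>         out.setLength(compressedOffset);
>         out.seek(compressedOffset);
>         // Problem 1: without the two lines below, the next flushChunk computes
>         // its offset from stale sizes that still count the discarded chunks.
>         compressedSize = compressedOffset;
>         uncompressedSize = uncompressedOffset;
>         // Problem 2: crc already includes the discarded chunks and CRC32
>         // cannot be rewound, so the streaming value is permanently wrong.
>     }
>
>     // Possible fix for problem 2: rescan the data file at the end instead of
>     // trusting the streaming CRC once resetAndTruncate has been called.
>     long finalDigest() throws IOException {
>         CRC32 fresh = new CRC32();
>         out.seek(0);
>         byte[] buf = new byte[64 * 1024];
>         int n;
>         while ((n = out.read(buf)) > 0)
>             fresh.update(buf, 0, n);
>         return fresh.getValue();
>     }
> }
> ```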
> Reproduction - [unit
> test|https://github.com/jasonstack/cassandra/commit/106e9460f7e6e9690aa056def5329174bbca9a47]
> (not sure why the test didn't work; some buffer was already released):
> * populate multiple large partitions
> * corrupt the middle of the file
> The point is to keep appending chunks after the corruption point where
> `resetAndTruncate` is called.
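>
> A rough, self-contained sketch of the corruption step; the file argument and
> the choice of offset are assumptions for illustration, not taken from the
> linked test:
> ```java
> import java.io.RandomAccessFile;
>
> // Flip one byte in the middle of a data file so a chunk checksum fails there
> // while later chunks still remain to be appended during scrub.
> public class CorruptMiddle {
>     public static void main(String[] args) throws Exception {
>         try (RandomAccessFile f = new RandomAccessFile(args[0], "rw")) {
>             long middle = f.length() / 2;  // past the first chunks, before the last
>             f.seek(middle);
>             int original = f.read();       // read advances the position by one
>             f.seek(middle);                // seek back before overwriting
>             f.write(~original);            // flipped bits break the chunk checksum
>         }
>     }
> }
> ```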