[ https://issues.apache.org/jira/browse/CASSANDRA-19786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhao Yang updated CASSANDRA-19786:
----------------------------------
    Summary: Scrubber can't fix corruption on a corrupted compressed sstable 
and generates a corrupted sstable  (was: Scrubber can't fix corruption on 
corrupted compressed sstable)

> Scrubber can't fix corruption on a corrupted compressed sstable and 
> generates a corrupted sstable
> --------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19786
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19786
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/SSTable
>            Reporter: Zhao Yang
>            Priority: Normal
>
> When the scrubber detects corruption, 
> `CompressedSequentialWriter#resetAndTruncate` is called, and two things go 
> wrong (see the sketch after this list):
> * `uncompressedSize` and `compressedSize` are not reset, so when the next 
> chunk is flushed, the last flush offset is computed from the stale sizes. We 
> should get rid of these variables.
> * The CRC is accumulated over flushed chunks, including chunks that were 
> later truncated away, so the final digest is wrong. If `resetAndTruncate` 
> was called, the writer has to rescan the data file at the end to compute the 
> proper digest.
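>
> A minimal sketch of both fixes, assuming a simplified stand-in writer (the 
> class, fields, and signatures below are illustrative, not the actual 
> `CompressedSequentialWriter` API):
> ```java
> import java.io.IOException;
> import java.io.RandomAccessFile;
> import java.util.zip.CRC32;
>
> // Hypothetical, simplified chunk writer; only sketches the two fixes.
> class ChunkWriterSketch
> {
>     private final RandomAccessFile dataFile;
>     // Running totals the ticket says are not reset today.
>     private long uncompressedSize = 0;
>     private long compressedSize = 0;
>     private boolean truncated = false;
>
>     ChunkWriterSketch(RandomAccessFile dataFile) { this.dataFile = dataFile; }
>
>     void flushChunk(byte[] compressed, int rawLength) throws IOException
>     {
>         dataFile.write(compressed);
>         compressedSize += compressed.length;
>         uncompressedSize += rawLength;
>     }
>
>     // Drop everything after the last good chunk boundary.
>     void resetAndTruncate(long goodCompressed, long goodUncompressed) throws IOException
>     {
>         dataFile.setLength(goodCompressed);
>         dataFile.seek(goodCompressed);
>         // Fix 1: reset the totals so the next flush offset is correct.
>         compressedSize = goodCompressed;
>         uncompressedSize = goodUncompressed;
>         // Fix 2: the incremental CRC now covers discarded bytes; mark it stale.
>         truncated = true;
>     }
>
>     // Final digest: rescan the whole data file if a truncation happened.
>     long finalDigest(CRC32 incremental) throws IOException
>     {
>         if (!truncated)
>             return incremental.getValue();
>         CRC32 crc = new CRC32();
>         dataFile.seek(0);
>         byte[] buf = new byte[64 * 1024];
>         int n;
>         while ((n = dataFile.read(buf)) > 0)
>             crc.update(buf, 0, n);
>         return crc.getValue();
>     }
> }
> ```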
> Reproduction: [unit 
> test|https://github.com/jasonstack/cassandra/commit/106e9460f7e6e9690aa056def5329174bbca9a47]
> (not sure why, but the test doesn't currently work: some buffer is already 
> released):
> * populate multiple large partitions
> * corrupt the middle of the file
> The point is to keep appending chunks after the corruption point where 
> `resetAndTruncate` is called; the sketch below shows the corruption step.
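>
> A minimal sketch of that corruption step, assuming a hypothetical helper 
> (the path argument and the 128-byte overwrite are illustrative; the linked 
> unit test is the authoritative reproduction):
> ```java
> import java.io.IOException;
> import java.io.RandomAccessFile;
>
> // Hypothetical helper: zero out bytes in the middle of a compressed sstable
> // data file so a chunk checksum fails partway through a scrub, while chunks
> // after the corruption point still get appended after resetAndTruncate.
> class CorruptMiddle
> {
>     static void corrupt(String dataFilePath) throws IOException
>     {
>         try (RandomAccessFile f = new RandomAccessFile(dataFilePath, "rw"))
>         {
>             f.seek(f.length() / 2);  // jump to the middle of the file
>             f.write(new byte[128]);  // clobber 128 bytes with zeros
>         }
>     }
> }
> ```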


