[ https://issues.apache.org/jira/browse/CASSANDRA-9140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520186#comment-14520186 ]
Tyler Hobbs commented on CASSANDRA-9140:
----------------------------------------
The changes you made look pretty good to me, overall. However, some of the
surrounding code in Scrubber seems odd or incorrect. I think we should try to
improve it a bit while we're working in this part of the code:
* We should use separate log messages for an unreadable key and differing
index/data keys. If the keys differ, we should log both keys.
* The warning log for {{dataStart != dataStartFromIndex}} should be moved
before the data read, and we should log both start positions.
** A similar check and log message for {{dataSize != dataSizeFromIndex}} would
be good.
* Unless I'm missing something, it looks like the retry doesn't actually use
the data size or position from the index. It seems like the intent is to try
to read the data based on the Data component's position and size (if present)
first, and if that fails, use the position and size from the index.
* If {{currentIndexKey}} is null (meaning there was an error reading from the
index), we should just set {{dataStartFromIndex}} and {{dataSizeFromIndex}} to
-1 to avoid confusing numbers in the log messages.
* {{dataStartFromIndex}} should be using the previous value of
{{nextRowPositionFromIndex}} (or 0 if it's the first row). Right now it's
using a combination of index positions and the data file positions.
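To make the suggestions above a bit more concrete, here is a rough,
self-contained sketch of what I have in mind. It is not the actual Scrubber
code: the {{RowPositions}} class, its method names, the {{RowReader}}
interface, the log message wording, and the use of plain strings for keys are
all made up for illustration. It also ignores any per-row key header when
computing the start position and treats a failed read as a
{{RuntimeException}}, both of which are simplifications.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Cassandra's Scrubber.
final class RowPositions {
    final long dataStartFromIndex;
    final long dataSizeFromIndex;

    private RowPositions(long start, long size) {
        this.dataStartFromIndex = start;
        this.dataSizeFromIndex = size;
    }

    // Derive start and size purely from index positions. The previous value of
    // nextRowPositionFromIndex (0 for the first row) is taken as the row start.
    // If the index key could not be read, both values are -1.
    static RowPositions fromIndex(boolean indexKeyReadable,
                                  long prevNextRowPositionFromIndex,
                                  long nextRowPositionFromIndex) {
        if (!indexKeyReadable)
            return new RowPositions(-1, -1);
        return new RowPositions(prevNextRowPositionFromIndex,
                                nextRowPositionFromIndex - prevNextRowPositionFromIndex);
    }

    boolean usable() {
        return dataStartFromIndex >= 0 && dataSizeFromIndex >= 0;
    }

    // Separate messages for an unreadable key and for differing keys.
    // Returns null when there is nothing to report.
    static String describeKeyProblem(String dataKey, String indexKey) {
        if (dataKey == null)
            return "Unable to read row key from data file";
        if (indexKey != null && !dataKey.equals(indexKey))
            return String.format("Row key from data file (%s) differs from key in index file (%s)",
                                 dataKey, indexKey);
        return null;
    }

    // Meant to be called before the data read; each message carries both values.
    static List<String> describeMismatches(long dataStart, long dataSize, RowPositions idx) {
        List<String> warnings = new ArrayList<>();
        if (!idx.usable())
            return warnings; // nothing meaningful to compare against
        if (dataStart != idx.dataStartFromIndex)
            warnings.add(String.format("Data file row position %d differs from index file row position %d",
                                       dataStart, idx.dataStartFromIndex));
        if (dataSize != idx.dataSizeFromIndex)
            warnings.add(String.format("Data file row size %d differs from index file row size %d",
                                       dataSize, idx.dataSizeFromIndex));
        return warnings;
    }

    interface RowReader<T> {
        T read(long start, long size);
    }

    // Try the position and size from the Data component first; if that fails,
    // retry with the position and size from the index (when they are usable
    // and actually different from what was already tried).
    static <T> T readWithIndexFallback(RowReader<T> reader, long dataStart, long dataSize,
                                       RowPositions idx) {
        try {
            return reader.read(dataStart, dataSize);
        } catch (RuntimeException e) {
            boolean samePlace = dataStart == idx.dataStartFromIndex
                             && dataSize == idx.dataSizeFromIndex;
            if (!idx.usable() || samePlace)
                throw e;
            return reader.read(idx.dataStartFromIndex, idx.dataSizeFromIndex);
        }
    }
}
```

The main point is that the retry really does pass the index-derived position
and size to the reader, and that those values never mix index positions with
data file positions.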
Let me know what you think about the above suggestions.
> Scrub should handle corrupted compressed chunks
> -----------------------------------------------
>
> Key: CASSANDRA-9140
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9140
> Project: Cassandra
> Issue Type: Improvement
> Components: Tools
> Reporter: Tyler Hobbs
> Assignee: Stefania
> Fix For: 2.1.x, 2.0.x
>
>
> Scrub can handle corruption within a row, but can't handle corruption of a
> compressed sstable that results in being unable to decompress a chunk. Since
> the majority of Cassandra users are probably running with compression
> enabled, it's important that scrub be able to handle this (likely more
> common) form of sstable corruption.