[jira] [Commented] (CASSANDRA-9947) nodetool verify is broken
[ https://issues.apache.org/jira/browse/CASSANDRA-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15000955#comment-15000955 ]

Ariel Weisberg commented on CASSANDRA-9947:
-------------------------------------------

The fact that we never validate checksums on uncompressed data on reads creates problems for repair even before verify is run. We can propagate corrupted data because the Merkle tree will detect the corruption as a difference between replicas, and repair will then attempt to propagate it without validating the checksum of the corrupt data.

Right now scrub isn't going to validate checksums on uncompressed files, based on my reading, so scrubbing won't improve the situation.

I also don't see how scrub can fix a corrupted compressed table, since the checksum is not per record. It covers an arbitrary 64k page. You could try to parse the page anyway, but that is not what is currently done, since the reader will just throw an exception if you try. Corrupted sstables work fine in the regular read path because the index points you to a valid place to start reading from, but that won't work for a sequential walk through the file.

It seems to me like we are shuffling deck chairs on the Titanic once we allow repair to propagate corrupted data. You could say the same about returning corrupted data to user queries, since those can be used to propagate the corruption back into C* at all replicas.

If there are corruption-handling flows we want to support, it might make sense to create some test cases for the various file formats and see what the existing code actually does. My suspicion is that sequential access is going to fail in the compressed case and blindly succeed in the uncompressed case.

We also need to nail down fix versions, since converging on something that works might not be possible or worthwhile against existing formats. And while we are at it, maybe we should nail down file formats we are happier with in terms of being flexible about block sizes, implementing a page cache, etc.
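To make the "checksum is not per record" point concrete, here is a minimal Python sketch. It is not Cassandra code: the function names, the use of CRC32, and the fixed 64 KiB chunk size are assumptions chosen only to mirror the "64k page" mentioned above. It shows that a per-chunk checksum can tell you which chunk is damaged, but nothing about which record inside that chunk is affected.

```python
import zlib
from typing import List

CHUNK_SIZE = 64 * 1024  # assumed chunk size, mirroring the "64k page" in the comment


def chunk_checksums(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[int]:
    """Compute one CRC32 per fixed-size chunk, ignoring record boundaries."""
    return [zlib.crc32(data[i:i + chunk_size]) for i in range(0, len(data), chunk_size)]


def corrupt_chunks(data: bytes, expected: List[int], chunk_size: int = CHUNK_SIZE) -> List[int]:
    """Return the indexes of chunks whose CRC32 no longer matches the stored value."""
    actual = chunk_checksums(data, chunk_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


# Exactly three chunks of data, with a single bit flipped inside the second chunk.
original = bytes(range(256)) * 768
stored = chunk_checksums(original)

damaged = bytearray(original)
damaged[CHUNK_SIZE + 1234] ^= 0x01

bad = corrupt_chunks(bytes(damaged), stored)  # identifies the chunk, not the record
```

In this toy model the whole second chunk is suspect; any record that starts or ends inside it cannot be trusted, which is why a sequential walk cannot simply skip "the bad record".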
> nodetool verify is broken
> -------------------------
>
>                 Key: CASSANDRA-9947
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9947
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>            Reporter: Jonathan Ellis
>            Priority: Critical
>             Fix For: 2.2.4
>
>
> Raised these issues on CASSANDRA-5791, but didn't revert/re-open, so they
> were ignored:
> We mark sstables that fail verification as unrepaired, but that's not going
> to do what you think. What it means is that the local node will use that
> sstable in the next repair, but other nodes will not. So all we'll end up
> doing is streaming whatever data we can read from it, to the other replicas.
> If we could magically mark whatever sstables correspond on the remote nodes,
> to the data in the local sstable, that would work, but we can't.
> IMO what we should do is:
> * scrub, because it's quite likely we'll fail reading from the sstable
> otherwise and
> * full repair across the data range covered by the sstable
> Additionally,
> * I'm not sure that keeping "extended verify" code around is worth it. Since
> the point is to work around not having a checksum, we could just scrub
> instead. This is slightly more heavyweight but it would be a one-time cost
> (scrub would build a new checksum) and we wouldn't have to worry about
> keeping two versions of almost-the-same-code in sync.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
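The description notes that scrub would build a new checksum. The following Python sketch illustrates what that implies when the data is readable but already damaged. It is a toy model, not Cassandra's scrub: the `verify` and `scrub` functions, the record layout, and the use of CRC32 are all assumptions made for illustration.

```python
import zlib
from typing import Tuple


def verify(data: bytes, stored_checksum: int) -> bool:
    """Return True if the data still matches the checksum stored alongside it."""
    return zlib.crc32(data) == stored_checksum


def scrub(data: bytes) -> Tuple[bytes, int]:
    """Toy scrub: rewrite whatever is readable and compute a fresh checksum over it."""
    return data, zlib.crc32(data)


good = b"key=42;value=hello"
stored = zlib.crc32(good)

# Flip one bit in the value. The bytes still parse, so a scrub can read them.
flipped = bytearray(good)
flipped[-1] ^= 0x01
flipped = bytes(flipped)

detected_before = not verify(flipped, stored)          # old checksum catches the flip
rewritten, new_checksum = scrub(flipped)
detected_after = not verify(rewritten, new_checksum)   # new checksum matches the bad data
```

Under these assumptions, the rewritten data verifies cleanly against its new checksum even though it differs from the original, so the evidence of corruption is gone after the rewrite.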
[jira] [Commented] (CASSANDRA-9947) nodetool verify is broken
[ https://issues.apache.org/jira/browse/CASSANDRA-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14977114#comment-14977114 ]

Jeff Jirsa commented on CASSANDRA-9947:
---------------------------------------

The intent behind marking it as unrepaired was to allow other nodes to repair this data inbound, though it does make sense that by doing so we may also stream that corrupt data out to other replicas.

Issuing a scrub may not be the right fix, either: a single bit flip in an uncompressed table will scrub just fine, and nothing will happen except that you'll write a new checksum and lose knowledge of the fact that the bit flip happened.

Maybe the limitation here is that we have effectively two states, repaired and unrepaired, when we need a third, corrupt, so we can force this local node to repair its range without using that sstable as a source for outgoing repair streams?

> nodetool verify is broken
> -------------------------
>
>                 Key: CASSANDRA-9947
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9947
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Jonathan Ellis
>            Priority: Critical
>             Fix For: 2.2.x
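The three-state idea in this comment can be sketched as follows. This is a hypothetical model in Python, not an existing Cassandra API: `RepairState`, `SSTable`, `outgoing_sources`, and `needs_inbound_repair` are names invented here, and the selection rule is a simplified stand-in for how repair picks sstables.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class RepairState(Enum):
    REPAIRED = "repaired"
    UNREPAIRED = "unrepaired"
    CORRUPT = "corrupt"  # hypothetical third state proposed in the comment


@dataclass(frozen=True)
class SSTable:
    name: str
    state: RepairState


def outgoing_sources(sstables: List[SSTable]) -> List[SSTable]:
    """Toy rule: only unrepaired sstables feed outgoing repair streams; corrupt ones never do."""
    return [s for s in sstables if s.state is RepairState.UNREPAIRED]


def needs_inbound_repair(sstables: List[SSTable]) -> List[SSTable]:
    """SSTables whose covered range should be repaired inbound from other replicas."""
    return [s for s in sstables if s.state is RepairState.CORRUPT]


tables = [
    SSTable("a-1", RepairState.REPAIRED),
    SSTable("a-2", RepairState.UNREPAIRED),
    SSTable("a-3", RepairState.CORRUPT),  # failed verification
]

sources = [s.name for s in outgoing_sources(tables)]
to_repair = [s.name for s in needs_inbound_repair(tables)]
```

With only two states, the sstable that failed verification would have to be marked unrepaired and would then appear in `sources`. The extra state lets it appear in `to_repair` instead.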
[jira] [Commented] (CASSANDRA-9947) nodetool verify is broken
[ https://issues.apache.org/jira/browse/CASSANDRA-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649199#comment-14649199 ]

Jonathan Ellis commented on CASSANDRA-9947:
-------------------------------------------

IMO we should disable verify for 2.2.1 until we can rearchitect it since this is a nontrivial change.

> nodetool verify is broken
> -------------------------
>
>                 Key: CASSANDRA-9947
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9947
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Jonathan Ellis
>            Priority: Critical
>             Fix For: 2.2.x