[jira] [Commented] (CASSANDRA-9947) nodetool verify is broken

2015-11-11 Thread Ariel Weisberg (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15000955#comment-15000955 ]

Ariel Weisberg commented on CASSANDRA-9947:
---

The fact that we never validate checksums on uncompressed data on reads creates 
problems for repair even before verify is run. We can end up propagating 
corrupted data because the merkle tree will treat the corruption as an ordinary 
difference, and repair will then stream the corrupt data to other replicas 
without ever validating its checksum.
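
To make that concrete, validating uncompressed reads would mean something along 
these lines. This is only a rough sketch: the chunk size, file names, and the 
one-checksum-per-chunk companion file are invented for illustration, not the 
actual read path.

{code:java}
// Rough sketch only -- not the actual Cassandra read path. The layout
// (a data file plus a hypothetical one-long-per-chunk checksum file) and
// the names are invented to illustrate per-chunk validation of uncompressed reads.
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;

public class ChunkedChecksumReader
{
    private static final int CHUNK_SIZE = 64 * 1024;

    public static void verify(String dataFile, String checksumFile) throws IOException
    {
        try (InputStream data = new FileInputStream(dataFile);
             DataInputStream sums = new DataInputStream(new FileInputStream(checksumFile)))
        {
            byte[] chunk = new byte[CHUNK_SIZE];
            long offset = 0;
            int read;
            while ((read = readChunk(data, chunk)) > 0)
            {
                CRC32 crc = new CRC32();
                crc.update(chunk, 0, read);
                long expected = sums.readLong(); // one stored checksum per chunk
                if (crc.getValue() != expected)
                    throw new IOException("Checksum mismatch in chunk at offset " + offset);
                offset += read;
            }
        }
    }

    // fill the buffer as far as the stream allows; returns bytes actually read
    private static int readChunk(InputStream in, byte[] buf) throws IOException
    {
        int total = 0;
        while (total < buf.length)
        {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0)
                break;
            total += n;
        }
        return total;
    }
}
{code}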

Right now scrub isn't going to validate checksums on uncompressed files either, 
based on my reading, so scrubbing won't improve the situation. I also don't see 
how scrub can fix a corrupted compressed table, since the checksum is not per 
record; it covers an arbitrary 64k page. You could try to parse the page 
anyway, but that is not what currently happens, since the reader will just 
throw an exception if you try. Corrupted sstables work fine on the regular read 
path because the index points you to a valid place to start reading from, but 
that won't help a sequential walk through the file.
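
For illustration, here is a toy sequential scanner over a length-prefixed, 
per-chunk-checksummed file. The layout is made up (it is not the real 
compression format), but it shows why a single bad ~64k chunk ends the whole 
walk:

{code:java}
// Toy sequential scanner over a length-prefixed, per-chunk-checksummed file.
// The layout is invented; the point is that the only checksum is over the
// ~64k chunk, so one bad chunk ends the walk with nowhere to resynchronize.
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.zip.CRC32;

public class SequentialCompressedScan
{
    public static void scan(String file) throws IOException
    {
        try (DataInputStream in = new DataInputStream(new FileInputStream(file)))
        {
            while (true)
            {
                int length;
                try { length = in.readInt(); }        // compressed chunk length
                catch (EOFException eof) { return; }  // clean end of file

                byte[] compressed = new byte[length];
                in.readFully(compressed);
                long stored = in.readLong();          // checksum stored after the chunk

                CRC32 crc = new CRC32();
                crc.update(compressed);
                if (crc.getValue() != stored)
                    // no per-record checksum to fall back on, and no index to skip to
                    // the next readable position, so the sequential walk is dead here
                    throw new IOException("Corrupt chunk; cannot resynchronize");

                // decompress and parse rows here in the real thing
            }
        }
    }
}
{code}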

It seems to me like we are shuffling deck chairs on the Titanic once we allow 
repair to propagate corrupted data. You could say the same about returning 
corrupted data to user queries, since those reads can be used to propagate the 
corruption back into C* at all replicas.

If there are corruption-handling flows we want to support, it might make sense 
to create some test cases for the various file formats and see what the 
existing code actually does. My suspicion is that sequential access is going to 
fail in the compressed case and blindly succeed in the uncompressed case.
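
Something like the harness below would be a starting point for those tests. It 
is only a sketch; the steps referring to readers and formats are placeholders 
for whatever the existing code paths turn out to be.

{code:java}
// Sketch of the kind of harness such tests would need: corrupt one bit on disk,
// then run the sequential reader for each format and record what actually happens.
import java.io.IOException;
import java.io.RandomAccessFile;

public class CorruptionTestHarness
{
    /** Flip a single bit at the given byte offset in an sstable component on disk. */
    public static void flipBit(String path, long byteOffset, int bit) throws IOException
    {
        try (RandomAccessFile f = new RandomAccessFile(path, "rw"))
        {
            f.seek(byteOffset);
            int b = f.read();
            f.seek(byteOffset);
            f.write(b ^ (1 << bit));
        }
    }

    // For each format (compressed / uncompressed, old and new versions):
    //   1. write a small sstable with known contents
    //   2. flipBit(dataFile, someOffset, 0)
    //   3. walk it sequentially (as scrub/verify/streaming would) and assert whether
    //      we get an exception (expected for compressed) or silently read garbage
    //      (suspected for uncompressed).
}
{code}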

We also need to nail down fix versions, since coalescing to something that 
works might not be possible or worthwhile against the existing formats. And 
while we are at it, maybe we should nail down file formats we are happier with 
in terms of being flexible about block sizes, implementing a page cache, etc.
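
As a strawman for the block-size flexibility, the chunk size could simply be 
recorded in a small per-file header, so readers (and a page cache keyed on 
chunk-aligned offsets) don't hard-code 64k. The layout and names below are 
entirely made up:

{code:java}
// Purely hypothetical header layout, just to illustrate "flexible about block sizes":
// the chunk size lives in the file itself rather than being assumed by readers.
import java.nio.ByteBuffer;

public class ChecksummedFormatHeader
{
    public static final int MAGIC = 0xCA55E77A; // made-up magic number

    public final int chunkSize;     // power of two, chosen at write time
    public final int checksumType;  // e.g. 0 = CRC32, 1 = Adler32

    public ChecksummedFormatHeader(int chunkSize, int checksumType)
    {
        this.chunkSize = chunkSize;
        this.checksumType = checksumType;
    }

    public ByteBuffer serialize()
    {
        return (ByteBuffer) ByteBuffer.allocate(12)
                                      .putInt(MAGIC)
                                      .putInt(chunkSize)
                                      .putInt(checksumType)
                                      .flip();
    }
}
{code}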

> nodetool verify is broken
> -
>
> Key: CASSANDRA-9947
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9947
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Jonathan Ellis
>Priority: Critical
> Fix For: 2.2.4
>
>
> Raised these issues on CASSANDRA-5791, but didn't revert/re-open, so they 
> were ignored:
> We mark sstables that fail verification as unrepaired, but that's not going 
> to do what you think.  What it means is that the local node will use that 
> sstable in the next repair, but other nodes will not. So all we'll end up 
> doing is streaming whatever data we can read from it, to the other replicas.  
> If we could magically mark whatever sstables correspond on the remote nodes, 
> to the data in the local sstable, that would work, but we can't.
> IMO what we should do is:
> * scrub, because it's quite likely we'll fail reading from the sstable 
> otherwise and
> * full repair across the data range covered by the sstable
> Additionally,
> * I'm not sure that keeping "extended verify" code around is worth it. Since 
> the point is to work around not having a checksum, we could just scrub 
> instead. This is slightly more heavyweight but it would be a one-time cost 
> (scrub would build a new checksum) and we wouldn't have to worry about 
> keeping two versions of almost-the-same-code in sync.





[jira] [Commented] (CASSANDRA-9947) nodetool verify is broken

2015-10-27 Thread Jeff Jirsa (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14977114#comment-14977114 ]

Jeff Jirsa commented on CASSANDRA-9947:
---

The intent behind marking it as unrepaired was to allow other nodes to repair 
this data inbound, though it does make sense that by doing so we'll also 
potentially push that corrupt data out to other replicas. Issuing a scrub may 
not be the right fix, either - a single bit flip in an uncompressed table will 
scrub just fine, and nothing will happen except that you'll write a new 
checksum and lose all knowledge that the bit flip ever happened.
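
To spell that failure mode out, here is a standalone toy example (not scrub 
code) showing how recomputing a checksum over already-corrupt bytes simply 
makes the corruption permanent and undetectable:

{code:java}
// Toy demonstration: if scrub recomputes checksums from whatever bytes are on
// disk, a flipped bit simply gets a fresh, valid checksum and becomes invisible.
import java.util.zip.CRC32;

public class ScrubLegitimizesCorruption
{
    public static void main(String[] args)
    {
        byte[] original = "some row data".getBytes();
        long originalCrc = crc(original);

        byte[] corrupted = original.clone();
        corrupted[3] ^= 0x01;                                // single bit flip on disk

        // verify against the old checksum: mismatch, corruption is still detectable
        System.out.println(crc(corrupted) == originalCrc);   // false

        // "scrub" recomputes and stores a new checksum over the corrupt bytes
        long rewrittenCrc = crc(corrupted);

        // from now on the file validates cleanly, and the flip is invisible
        System.out.println(crc(corrupted) == rewrittenCrc);  // true
    }

    private static long crc(byte[] bytes)
    {
        CRC32 crc = new CRC32();
        crc.update(bytes);
        return crc.getValue();
    }
}
{code}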

Maybe the limitation here is that we effectively have 2 states - repaired and 
unrepaired - when we need a third - corrupt - so we can force this local node 
to repair its range without using that sstable as a source for outgoing repair 
streams?
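
Roughly what I have in mind, with made-up names, would be something like:

{code:java}
// Sketch of the third-state idea: a CORRUPT sstable keeps its token range
// eligible for inbound repair but is never used as an outgoing stream source.
public enum RepairState
{
    REPAIRED,
    UNREPAIRED,
    CORRUPT;

    /** May this sstable's data be streamed out to other replicas during repair? */
    public boolean usableAsRepairSource()
    {
        return this != CORRUPT;
    }

    /** Should this sstable's range be included when deciding what to repair locally? */
    public boolean rangeNeedsRepair()
    {
        return this != REPAIRED;
    }
}
{code}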



[jira] [Commented] (CASSANDRA-9947) nodetool verify is broken

2015-07-31 Thread Jonathan Ellis (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649199#comment-14649199 ]

Jonathan Ellis commented on CASSANDRA-9947:
---

IMO we should disable verify for 2.2.1 until we can rearchitect it since this 
is a nontrivial change.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)