[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649200#comment-14649200 ] Jonathan Ellis commented on CASSANDRA-5791: --- Created CASSANDRA-9947 to follow up. A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor Fix For: 2.2.0 beta 1 Attachments: cassandra-5791-20150319.diff, cassandra-5791-patch-3.diff, cassandra-5791.patch-2 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620541#comment-14620541 ] Jonathan Ellis commented on CASSANDRA-5791: --- bq. without the marking-unrepaired part, incremental repair does not handle the bitrot case formerly handled by non-incremental repair The problem I'm bringing up is that even *with* marking unrepaired, incremental repair does not handle bitrot. What we'd need to do is mark unrepaired the exact sstables that contain the same data as the bitrotted one, on the other replicas, which is impossible. Incremental repair syncs up all the replicas but it doesn't repair when we lose data that we once had but don't anymore, that's the tradeoff. So for bitrot, like other data loss, you need full repair. (But, we can do better here than a typical human would, by only doing a full repair on the range covered by the corrupt sstable.) A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor Fix For: 2.2.0 beta 1 Attachments: cassandra-5791-20150319.diff, cassandra-5791-patch-3.diff, cassandra-5791.patch-2 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619966#comment-14619966 ] Robert Coli commented on CASSANDRA-5791: {quote} [...] so we mark sstables that fail verification as unrepaired? Because that's not going to help much [...] IMO what we should do is: # scrub, because it's quite likely we'll fail reading from the sstable otherwise and # full repair across the data range covered by the sstable {quote} This is the wrinkle from the (otherwise duplicate of this ticket) CASSANDRA-8703 : {quote} From my understanding, if bitrot is detected (via eg the CRC on the read path) then all SSTables containing the corrupted range need to be marked unrepaired on all replicas. Per marcuse@IRC, the naive/simplest response would be to just trigger a full repair in this case. {quote} As an operator, my personal interest in 5791/8703 is that without the marking-unrepaired part, incremental repair does not handle the bitrot case formerly handled by non-incremental repair. tl;dr - I am +1 to the above approach! A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor Fix For: 2.2.0 beta 1 Attachments: cassandra-5791-20150319.diff, cassandra-5791-patch-3.diff, cassandra-5791.patch-2 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619190#comment-14619190 ] Jonathan Ellis commented on CASSANDRA-5791: --- ... so we mark sstables that fail verification as unrepaired? Because that's not going to help much: it means the local node will use that sstable in the next repair, but other nodes will not. So all we'll end up doing is streaming whatever data we can read from it, to the other replicas. IMO what we should do is: # scrub, because it's quite likely we'll fail reading from the sstable otherwise and # full repair across the data range covered by the sstable A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor Fix For: 2.2.0 beta 1 Attachments: cassandra-5791-20150319.diff, cassandra-5791-patch-3.diff, cassandra-5791.patch-2 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619194#comment-14619194 ] Jonathan Ellis commented on CASSANDRA-5791: --- More minor point: I'm not sure that keeping extended verify code around is worth it. Since the point is to work around not having a checksum, we could just scrub instead. This is slightly more heavyweight but it would be a one-time cost (scrub would build a new checksum) and we wouldn't have to worry about keeping two versions of almost-the-same-code in sync. A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor Fix For: 2.2.0 beta 1 Attachments: cassandra-5791-20150319.diff, cassandra-5791-patch-3.diff, cassandra-5791.patch-2 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14391891#comment-14391891 ] Dave Brosius commented on CASSANDRA-5791: - ok, thanks. i ninja'ed a minor change to make it more obvious this was intentional, so as not to confuse the reader -SSTableIdentityIterator atoms = new SSTableIdentityIterator(sstable, dataFile, key, true); +//mimic the scrub read path +new SSTableIdentityIterator(sstable, dataFile, key, true); A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor Fix For: 3.0 Attachments: cassandra-5791-20150319.diff, cassandra-5791-patch-3.diff, cassandra-5791.patch-2 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14391712#comment-14391712 ] Jeff Jirsa commented on CASSANDRA-5791: --- Sylvain asked the same thing in IRC this morning - the intent was to mimic the scrub read path, which would allow the SSTableIdentityIterator to discover compression checksum exceptions while decompressing, though - as Sylvain pointed out to me - it's probably insufficient for large partitions. I'll need investigate to determine whether or not it will actually throw on blocks beyond the start of the partition. A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor Fix For: 3.0 Attachments: cassandra-5791-20150319.diff, cassandra-5791-patch-3.diff, cassandra-5791.patch-2 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14391691#comment-14391691 ] Dave Brosius commented on CASSANDRA-5791: - In Verifier ~ line 187 in method public void verify(boolean extended) throws IOException is SSTableIdentityIterator atoms = new SSTableIdentityIterator(sstable, dataFile, key, true); is that doing something here? a side effect perhaps, or just left over to be removed? A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor Fix For: 3.0 Attachments: cassandra-5791-20150319.diff, cassandra-5791-patch-3.diff, cassandra-5791.patch-2 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388344#comment-14388344 ] Branimir Lambov commented on CASSANDRA-5791: +1 I agree that there isn't need to test DataIntegrityMetadata separately. A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor Fix For: 3.0 Attachments: cassandra-5791-20150319.diff, cassandra-5791-patch-3.diff, cassandra-5791.patch-2 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14370665#comment-14370665 ] Jeff Jirsa commented on CASSANDRA-5791: --- So the same bug corrected in CASSANDRA-8778 was re-introduced by CASSANDRA-8709 , as it was developed in parallel and was likely merged/reviewed without the benefit of knowing about #8778. I've done the following: 1) Rebased to trunk as of 20150319 2) Removed o.a.c.io.DataIntegrityMetadata#append 3) Corrected o.a.c.io.DataIntegrityMetadata#appendDirect 4) Brought over [~benedict]'s PureJavaCRC32 's fix from above (which was correct - 7bef6f93aea3a6897b53e909688f5948c018ccdf) Commit: https://github.com/jeffjirsa/cassandra/commit/79642ea4f56a33f249e807abdd562f89d20f6c36 Diff at https://github.com/apache/cassandra/compare/trunk...jeffjirsa:cassandra-5791.diff , I'll also attach here as cassandra-5791-20150319.diff Passing: {noformat} [junit] - --- [junit] Testsuite: org.apache.cassandra.db.VerifyTest [junit] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.803 sec [junit] {noformat} [~jbellis] and [~benedict] - if you want unit tests for DataIntegrityMetadata, Jira it, assign me, and I'll write them. I'd have done it tonight but I can't convince myself that they're not redundant with the (included) verifier unit tests which will test the checksums anyway. A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor Fix For: 3.0 Attachments: cassandra-5791-20150319.diff, cassandra-5791-patch-3.diff, cassandra-5791.patch-2 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14369750#comment-14369750 ] Jonathan Ellis commented on CASSANDRA-5791: --- reverted in b25adc765769869d16410f1ca156227745d9b17b until the tests can be fixed A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor Fix For: 3.0 Attachments: cassandra-5791-patch-3.diff, cassandra-5791.patch-2 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14370077#comment-14370077 ] Jeff Jirsa commented on CASSANDRA-5791: --- Cause of tests failing is that checksums are incorrect for compressed sstables again. {noformat} # cat /Users/jeff/.ccm/snapshot/node1/data/test2/metrics-aded07e0ce7711e4897c85b755fc16c4/la-1-big-Digest.adler32 822598308 # java AdlerCheckSum /Users/jeff/.ccm/snapshot/node1/data/test2/metrics-aded07e0ce7711e4897c85b755fc16c4/la-1-big-Data.db 864477438 {/oformat} The checksums should have been corrected by CASSANDRA-8778 so I'll figure out where the regression happened tonight after business hours PST. A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor Fix For: 3.0 Attachments: cassandra-5791-patch-3.diff, cassandra-5791.patch-2 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366959#comment-14366959 ] Benedict commented on CASSANDRA-5791: - Looks like this referenced PureJavaCRC32, which no longer is found on trunk. Could you have a look to confirm the patch otherwise applied cleanly? A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor Fix For: 3.0 Attachments: cassandra-5791-patch-3.diff, cassandra-5791.patch-2 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364963#comment-14364963 ] Branimir Lambov commented on CASSANDRA-5791: +1 [~benedict], I think the patch is ready. Could you commit it? A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor Attachments: cassandra-5791-patch-3.diff, cassandra-5791.patch-2 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355775#comment-14355775 ] Jeff Jirsa commented on CASSANDRA-5791: --- New Diff at https://github.com/apache/cassandra/compare/trunk...jeffjirsa:cassandra-5791.diff A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor Attachments: cassandra-5791-patch-3.diff, cassandra-5791.patch-2 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14353049#comment-14353049 ] Branimir Lambov commented on CASSANDRA-5791: +1, with two nits: [DataIntegrityMetadata 115,125|https://github.com/jeffjirsa/cassandra/commit/89c1153def3f0ef0804d45883d12b09e04bb872d#diff-be889b1991c498fde94c039b5e327269R125]: This looks risky. As {{Verifier}} will not try to build a {{FileDigestValidator}} when the digest is not present, should we still have this special case? [StandaloneVerifier 126|https://github.com/jeffjirsa/cassandra/commit/89c1153def3f0ef0804d45883d12b09e04bb872d#diff-85d9fb1ffe8c937029e4ec870f662f6fR126]: You may want to exit with code 1 if {{hasFailed}} is true. A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor Attachments: cassandra-5791-patch-3.diff, cassandra-5791.patch-2 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14351923#comment-14351923 ] Jeff Jirsa commented on CASSANDRA-5791: --- Since [~benedict] pulled in CASSANDRA-8778 (thanks Benedict!), I've rebased to exclude that patch from this change set, added unit tests, and collapsed all requested changes into a single commit for easy merging. Find it attached, or online at: https://github.com/jeffjirsa/cassandra/compare/cassandra-5791.diff or https://github.com/jeffjirsa/cassandra/commit/89c1153def3f0ef0804d45883d12b09e04bb872d A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor Attachments: cassandra-5791-patch-3.diff, cassandra-5791.patch-2 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14338652#comment-14338652 ] Branimir Lambov commented on CASSANDRA-5791: Thanks for the updates. Please extract the mutateRepairedAt/throw sequence to a private method to avoid the repetition. The verifier also needs a test, preferably one that writes an SSTable and checks both that it validates correctly (so that anyone changing SSTable format has to update this as well) and that changing a byte in it/its index causes a validation failure. A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor Attachments: cassandra-5791.patch-2 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14327399#comment-14327399 ] Branimir Lambov commented on CASSANDRA-5791: Looks mostly good, I have a few questions about things you don't currently treat as failures and a couple of nits: [Verifier.java 104|https://github.com/jeffjirsa/cassandra/commit/7367b2fd88fe025006c783d6e07c8a4890690f0a#diff-fa9585e54785eef42f1245aceda15806R104]: Should we not mutateRepaired on missing digest? [Verifier.java 130|https://github.com/jeffjirsa/cassandra/commit/7367b2fd88fe025006c783d6e07c8a4890690f0a#diff-fa9585e54785eef42f1245aceda15806R130]: If the first position in the index is always supposed to be 0, doesn't a different value signify an index corruption? Why assert instead of treating this as a validation failure? [Verifier.java 202|https://github.com/jeffjirsa/cassandra/commit/7367b2fd88fe025006c783d6e07c8a4890690f0a#diff-fa9585e54785eef42f1245aceda15806R202] and [168|https://github.com/jeffjirsa/cassandra/commit/7367b2fd88fe025006c783d6e07c8a4890690f0a#diff-fa9585e54785eef42f1245aceda15806R168]: You only warn on invalid index entries, but still use {{nextRowPositionFromIndex}} for walking the data-- shouldn't then an invalid index be also treated as a validation failure? [Verifier.java 196|https://github.com/jeffjirsa/cassandra/commit/7367b2fd88fe025006c783d6e07c8a4890690f0a#diff-fa9585e54785eef42f1245aceda15806R196]: Is this a good nor bad row? Shouldn't order violation be a validation error? [Verifier.java 227|https://github.com/jeffjirsa/cassandra/commit/7367b2fd88fe025006c783d6e07c8a4890690f0a#diff-fa9585e54785eef42f1245aceda15806R227]: Is this code reachable? The only place that sets badRows also throws. [StandaloneVerifier.java 128|https://github.com/jeffjirsa/cassandra/commit/7367b2fd88fe025006c783d6e07c8a4890690f0a#diff-85d9fb1ffe8c937029e4ec870f662f6fR128]: Should we stop on the first error? If there are more problems with the data wouldn't we want to mark all ranges as unrepaired? [sstableverify.bat 23|https://github.com/jeffjirsa/cassandra/commit/7367b2fd88fe025006c783d6e07c8a4890690f0a#diff-05cb8af4cf28268cb3ca58e70e47da22R23]: I'd remove the possibility to specify {{CASSANDRA_MAIN}} to avoid unexpected behaviour for people who might have it set for other purposes. Put the class name directly into the command as in {{nodetool.bat}}. [DataIntegrityMetadata.java 130|https://github.com/jeffjirsa/cassandra/commit/7367b2fd88fe025006c783d6e07c8a4890690f0a#diff-be889b1991c498fde94c039b5e327269R130]: Why not just use {{while (true)}} with {{break}} in the validate loop? [Component.java 50|https://github.com/jeffjirsa/cassandra/commit/7367b2fd88fe025006c783d6e07c8a4890690f0a#diff-2292395ba109e6df9e1650745810d30aR50]: Update comment too. [NodeTool.java 1313|https://github.com/jeffjirsa/cassandra/commit/7367b2fd88fe025006c783d6e07c8a4890690f0a#diff-1c11f86cc8881893cd6d369ffc23a809R1313]: Change the error message (flush-verification). A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor Attachments: cassandra-5791.patch-2 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14328280#comment-14328280 ] Jeff Jirsa commented on CASSANDRA-5791: --- Thanks for the feedback. On whether or not a missing digest indicates corruption. In the case of a missing digest, does it make more sense to imply --extended and verify atoms? Doing that at least verifies the inline checksums for compressed sstables? Most of the remaining nits are 100% valid, and due to me basing this on the scrub path without eliminating all of the obsolete code. Cleaning up to address. Only nit that seems inconsistent: sstableverify.bat ability to specify CASSANDRA_MAIN is consistent with other similar tools (sstablescrub, sstableupgrade, sstableloader, sstablekeys) A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor Attachments: cassandra-5791.patch-2 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14319695#comment-14319695 ] Jeff Jirsa commented on CASSANDRA-5791: --- Formerly attached cassandra-5791.patch.txt creates invalid Digests for Uncompressed tables (related to #8778). Will modify the patch to separate the fix for #8778 from the rest of #5791 A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Jeff Jirsa Priority: Minor CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310025#comment-14310025 ] Jeff Jirsa commented on CASSANDRA-5791: --- Duplicating my comment from 8703 here since its a dupe and prone to closure : I've got a version at https://github.com/jeffjirsa/cassandra/commits/cassandra-8703 that follows the scrub read path and implements nodetool verify / sstableverify. This works, for both compressed and uncompressed, but requires walking the entire sstable and verifies each on disk atom. This works, it just isn't very fast (though it is thorough). The faster method will be checking against the Digest.sha1 file (which actually contains an adler32 hash), and skipping the full iteration. I'll rebase and work that in, using the 'walk all atoms' approach above as an optional extended verify (-e) or similar, unless someone objects. Also going to rename the DIGEST sstable component to Digest.adler32 since it's definitely not sha1 anymore. A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Priority: Minor CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14181858#comment-14181858 ] John Sumsion commented on CASSANDRA-5791: - If there is bitrot that causes a checksum failure, I assume that this issue would cause the configured disk_failure_policy to take effect, is that true? A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Priority: Minor CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934263#comment-13934263 ] Jonathan Ellis commented on CASSANDRA-5791: --- 4165 is committed A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Priority: Minor CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841241#comment-13841241 ] Radovan Zvoncek commented on CASSANDRA-5791: Once CASSANDRA-4165 gets solved, it might be quite neat to reuse the upgradesstables code to iterate over stables. This way we'd only need to add verifying the digests. As long as nobody is currently looking into this, I'd like to give it a shot. A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Priority: Minor CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13740301#comment-13740301 ] sankalp kohli commented on CASSANDRA-5791: -- We might also want to check whether sstable is sorted and there is no overlap. Something like an online read only scrub A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Tyler Hobbs Priority: Minor Fix For: 1.2.9 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732273#comment-13732273 ] Jonathan Ellis commented on CASSANDRA-5791: --- This may be something we can slip into 1.2.x, but if it looks hairy we'll push to 2.0.y. A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Tyler Hobbs Priority: Minor Fix For: 1.2.9 CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13715695#comment-13715695 ] Brandon Williams commented on CASSANDRA-5791: - I think we should write a checksum for compressed sstables to make this easier, since it's trivial compared to extracting info from the CompressionInfo component. Then it's just a matter of iterating the sstables and comparing checksums. A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: Bug Components: Core Reporter: sankalp kohli Priority: Minor CUrrently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that re writes all the sstables. This command should check the hash of all sstables and return whether all data is readable all not. This should NOT care about consistency. The compressed sstables do not have hash so not sure how it will work there. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira