[jira] [Comment Edited] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14370665#comment-14370665 ]

Jeff Jirsa edited comment on CASSANDRA-5791 at 3/26/15 3:25 AM:

So the same bug corrected in CASSANDRA-8778 was re-introduced by CASSANDRA-8709.

I've done the following:
1) Rebased to trunk as of 20150325
2) Removed o.a.c.io.DataIntegrityMetadata#append
3) Corrected o.a.c.io.DataIntegrityMetadata#appendDirect
4) Brought over [~benedict]'s PureJavaCRC32 fix from above (which was correct - 7bef6f93aea3a6897b53e909688f5948c018ccdf)

Commit: https://github.com/jeffjirsa/cassandra/commit/daaa88878c023fbcd94d4aa7d02696a675a118dd
Diff at https://github.com/apache/cassandra/compare/trunk...jeffjirsa:cassandra-5791.diff

Passing:
{noformat}
[junit] - ---
[junit] Testsuite: org.apache.cassandra.db.VerifyTest
[junit] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.803 sec
[junit]
{noformat}

[~jbellis] and [~benedict] - if you want unit tests for DataIntegrityMetadata, Jira it, assign me, and I'll write them. I'd have done it tonight but I can't convince myself that they're not redundant with the (included) verifier unit tests which will test the checksums anyway.

(Will continue rebasing every few days as required - latest rebase 20150325, pull latest from github rather than using attached files)

was (Author: jjirsa):
So the same bug corrected in CASSANDRA-8778 was re-introduced by CASSANDRA-8709.

I've done the following:
1) Rebased to trunk as of 20150319
2) Removed o.a.c.io.DataIntegrityMetadata#append
3) Corrected o.a.c.io.DataIntegrityMetadata#appendDirect
4) Brought over [~benedict]'s PureJavaCRC32 fix from above (which was correct - 7bef6f93aea3a6897b53e909688f5948c018ccdf)

Commit: https://github.com/jeffjirsa/cassandra/commit/79642ea4f56a33f249e807abdd562f89d20f6c36
Diff at https://github.com/apache/cassandra/compare/trunk...jeffjirsa:cassandra-5791.diff, I'll also attach here as cassandra-5791-20150319.diff

Passing:
{noformat}
[junit] - ---
[junit] Testsuite: org.apache.cassandra.db.VerifyTest
[junit] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.803 sec
[junit]
{noformat}

[~jbellis] and [~benedict] - if you want unit tests for DataIntegrityMetadata, Jira it, assign me, and I'll write them. I'd have done it tonight but I can't convince myself that they're not redundant with the (included) verifier unit tests which will test the checksums anyway.

A nodetool command to validate all sstables in a node

Key: CASSANDRA-5791
URL: https://issues.apache.org/jira/browse/CASSANDRA-5791
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: sankalp kohli
Assignee: Jeff Jirsa
Priority: Minor
Fix For: 3.0
Attachments: cassandra-5791-20150319.diff, cassandra-5791-patch-3.diff, cassandra-5791.patch-2

Currently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that rewrites all the sstables. This command should check the hash of all sstables and return whether all data is readable or not. This should NOT care about consistency. The compressed sstables do not have a hash, so not sure how it will work there.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368584#comment-14368584 ]

Michael Shuler edited comment on CASSANDRA-5791 at 3/19/15 6:30 AM:

trunk HEAD fails 4 of these new VerifyTest unit tests - can we fix this up?
{noformat}
[junit] Testsuite: org.apache.cassandra.db.VerifyTest
[junit] Tests run: 10, Failures: 4, Errors: 0, Skipped: 0, Time elapsed: 3.161 sec
[junit]
[junit] - Standard Output ---
[junit] WARN 06:21:48 No host ID found, created 3921d9b6-df80-4a62-95cb-7b4ab506e29b (Note: This should happen exactly once per node).
[junit] WARN 06:21:48 No host ID found, created 3921d9b6-df80-4a62-95cb-7b4ab506e29b (Note: This should happen exactly once per node).
[junit] - ---
[junit] Testcase: testVerifyCorrect(org.apache.cassandra.db.VerifyTest): FAILED
[junit] Unexpected CorruptSSTableException
[junit] junit.framework.AssertionFailedError: Unexpected CorruptSSTableException
[junit] at org.apache.cassandra.db.VerifyTest.testVerifyCorrect(VerifyTest.java:123)
[junit]
[junit]
[junit] Testcase: testVerifyCounterCorrect(org.apache.cassandra.db.VerifyTest): FAILED
[junit] Unexpected CorruptSSTableException
[junit] junit.framework.AssertionFailedError: Unexpected CorruptSSTableException
[junit] at org.apache.cassandra.db.VerifyTest.testVerifyCounterCorrect(VerifyTest.java:145)
[junit]
[junit]
[junit] Testcase: testExtendedVerifyCorrect(org.apache.cassandra.db.VerifyTest): FAILED
[junit] Unexpected CorruptSSTableException
[junit] junit.framework.AssertionFailedError: Unexpected CorruptSSTableException
[junit] at org.apache.cassandra.db.VerifyTest.testExtendedVerifyCorrect(VerifyTest.java:167)
[junit]
[junit]
[junit] Testcase: testExtendedVerifyCounterCorrect(org.apache.cassandra.db.VerifyTest): FAILED
[junit] Unexpected CorruptSSTableException
[junit] junit.framework.AssertionFailedError: Unexpected CorruptSSTableException
[junit] at org.apache.cassandra.db.VerifyTest.testExtendedVerifyCounterCorrect(VerifyTest.java:189)
[junit]
[junit]
[junit] Test org.apache.cassandra.db.VerifyTest FAILED
{noformat}

was (Author: mshuler):
trunk HEAD fails all these new VerifyTest unit tests - can we fix this up?
{noformat}
[junit] Testsuite: org.apache.cassandra.db.VerifyTest
[junit] Tests run: 10, Failures: 4, Errors: 0, Skipped: 0, Time elapsed: 3.161 sec
[junit]
[junit] - Standard Output ---
[junit] WARN 06:21:48 No host ID found, created 3921d9b6-df80-4a62-95cb-7b4ab506e29b (Note: This should happen exactly once per node).
[junit] WARN 06:21:48 No host ID found, created 3921d9b6-df80-4a62-95cb-7b4ab506e29b (Note: This should happen exactly once per node).
[junit] - ---
[junit] Testcase: testVerifyCorrect(org.apache.cassandra.db.VerifyTest): FAILED
[junit] Unexpected CorruptSSTableException
[junit] junit.framework.AssertionFailedError: Unexpected CorruptSSTableException
[junit] at org.apache.cassandra.db.VerifyTest.testVerifyCorrect(VerifyTest.java:123)
[junit]
[junit]
[junit] Testcase: testVerifyCounterCorrect(org.apache.cassandra.db.VerifyTest): FAILED
[junit] Unexpected CorruptSSTableException
[junit] junit.framework.AssertionFailedError: Unexpected CorruptSSTableException
[junit] at org.apache.cassandra.db.VerifyTest.testVerifyCounterCorrect(VerifyTest.java:145)
[junit]
[junit]
[junit] Testcase: testExtendedVerifyCorrect(org.apache.cassandra.db.VerifyTest): FAILED
[junit] Unexpected CorruptSSTableException
[junit] junit.framework.AssertionFailedError: Unexpected CorruptSSTableException
[junit] at org.apache.cassandra.db.VerifyTest.testExtendedVerifyCorrect(VerifyTest.java:167)
[junit]
[junit]
[junit] Testcase: testExtendedVerifyCounterCorrect(org.apache.cassandra.db.VerifyTest): FAILED
[junit] Unexpected CorruptSSTableException
[junit] junit.framework.AssertionFailedError: Unexpected CorruptSSTableException
[junit] at org.apache.cassandra.db.VerifyTest.testExtendedVerifyCounterCorrect(VerifyTest.java:189)
[junit]
[junit]
[junit] Test org.apache.cassandra.db.VerifyTest FAILED
{noformat}

A nodetool command to validate all sstables in a node

Key: CASSANDRA-5791
URL: https://issues.apache.org/jira/browse/CASSANDRA-5791
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: sankalp kohli
[jira] [Comment Edited] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14370665#comment-14370665 ]

Jeff Jirsa edited comment on CASSANDRA-5791 at 3/20/15 4:54 AM:

So the same bug corrected in CASSANDRA-8778 was re-introduced by CASSANDRA-8709.

I've done the following:
1) Rebased to trunk as of 20150319
2) Removed o.a.c.io.DataIntegrityMetadata#append
3) Corrected o.a.c.io.DataIntegrityMetadata#appendDirect
4) Brought over [~benedict]'s PureJavaCRC32 fix from above (which was correct - 7bef6f93aea3a6897b53e909688f5948c018ccdf)

Commit: https://github.com/jeffjirsa/cassandra/commit/79642ea4f56a33f249e807abdd562f89d20f6c36
Diff at https://github.com/apache/cassandra/compare/trunk...jeffjirsa:cassandra-5791.diff, I'll also attach here as cassandra-5791-20150319.diff

Passing:
{noformat}
[junit] - ---
[junit] Testsuite: org.apache.cassandra.db.VerifyTest
[junit] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.803 sec
[junit]
{noformat}

[~jbellis] and [~benedict] - if you want unit tests for DataIntegrityMetadata, Jira it, assign me, and I'll write them. I'd have done it tonight but I can't convince myself that they're not redundant with the (included) verifier unit tests which will test the checksums anyway.

was (Author: jjirsa):
So the same bug corrected in CASSANDRA-8778 was re-introduced by CASSANDRA-8709, as it was developed in parallel and was likely merged/reviewed without the benefit of knowing about #8778.

I've done the following:
1) Rebased to trunk as of 20150319
2) Removed o.a.c.io.DataIntegrityMetadata#append
3) Corrected o.a.c.io.DataIntegrityMetadata#appendDirect
4) Brought over [~benedict]'s PureJavaCRC32 fix from above (which was correct - 7bef6f93aea3a6897b53e909688f5948c018ccdf)

Commit: https://github.com/jeffjirsa/cassandra/commit/79642ea4f56a33f249e807abdd562f89d20f6c36
Diff at https://github.com/apache/cassandra/compare/trunk...jeffjirsa:cassandra-5791.diff, I'll also attach here as cassandra-5791-20150319.diff

Passing:
{noformat}
[junit] - ---
[junit] Testsuite: org.apache.cassandra.db.VerifyTest
[junit] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.803 sec
[junit]
{noformat}

[~jbellis] and [~benedict] - if you want unit tests for DataIntegrityMetadata, Jira it, assign me, and I'll write them. I'd have done it tonight but I can't convince myself that they're not redundant with the (included) verifier unit tests which will test the checksums anyway.

A nodetool command to validate all sstables in a node

Key: CASSANDRA-5791
URL: https://issues.apache.org/jira/browse/CASSANDRA-5791
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: sankalp kohli
Assignee: Jeff Jirsa
Priority: Minor
Fix For: 3.0
Attachments: cassandra-5791-20150319.diff, cassandra-5791-patch-3.diff, cassandra-5791.patch-2

Currently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that rewrites all the sstables. This command should check the hash of all sstables and return whether all data is readable or not. This should NOT care about consistency. The compressed sstables do not have a hash, so not sure how it will work there.
[jira] [Comment Edited] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14370077#comment-14370077 ]

Jeff Jirsa edited comment on CASSANDRA-5791 at 3/19/15 8:43 PM:

Cause of tests failing is that checksums are incorrect for compressed sstables again.
{noformat}
# cat /Users/jeff/.ccm/snapshot/node1/data/test2/metrics-aded07e0ce7711e4897c85b755fc16c4/la-1-big-Digest.adler32
822598308
# java AdlerCheckSum /Users/jeff/.ccm/snapshot/node1/data/test2/metrics-aded07e0ce7711e4897c85b755fc16c4/la-1-big-Data.db
864477438
{noformat}
The checksums should have been corrected by CASSANDRA-8778 so I'll figure out where the regression happened tonight after business hours PST.

was (Author: jjirsa):
Cause of tests failing is that checksums are incorrect for compressed sstables again.
{noformat}
# cat /Users/jeff/.ccm/snapshot/node1/data/test2/metrics-aded07e0ce7711e4897c85b755fc16c4/la-1-big-Digest.adler32
822598308
# java AdlerCheckSum /Users/jeff/.ccm/snapshot/node1/data/test2/metrics-aded07e0ce7711e4897c85b755fc16c4/la-1-big-Data.db
864477438
{/oformat}
The checksums should have been corrected by CASSANDRA-8778 so I'll figure out where the regression happened tonight after business hours PST.

A nodetool command to validate all sstables in a node

Key: CASSANDRA-5791
URL: https://issues.apache.org/jira/browse/CASSANDRA-5791
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: sankalp kohli
Assignee: Jeff Jirsa
Priority: Minor
Fix For: 3.0
Attachments: cassandra-5791-patch-3.diff, cassandra-5791.patch-2

Currently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that rewrites all the sstables. This command should check the hash of all sstables and return whether all data is readable or not. This should NOT care about consistency. The compressed sstables do not have a hash, so not sure how it will work there.
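The comparison in the comment above (the value stored in la-1-big-Digest.adler32 versus an Adler-32 computed over la-1-big-Data.db) can be reproduced with a small standalone program. The source of the AdlerCheckSum tool used in the thread is not shown, so the following is only a sketch of what such a tool might look like: the method names checksumOf and readDigest are hypothetical, and it assumes the digest file stores the checksum as decimal text computed over the raw bytes of the data file, as the commands above suggest.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.Adler32;
import java.util.zip.CheckedInputStream;

// Hypothetical reconstruction of a checksum tool like the one invoked in the thread.
public class AdlerCheckSum {

    // Streams the whole file through java.util.zip.Adler32 and returns the checksum.
    public static long checksumOf(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file);
             CheckedInputStream checked = new CheckedInputStream(in, new Adler32())) {
            byte[] buffer = new byte[64 * 1024];
            while (checked.read(buffer) != -1) {
                // Reading is enough; CheckedInputStream updates the checksum as bytes pass through.
            }
            return checked.getChecksum().getValue();
        }
    }

    // Assumption: the digest file holds the checksum as decimal text, e.g. "822598308".
    public static long readDigest(Path digestFile) throws IOException {
        String text = new String(Files.readAllBytes(digestFile), StandardCharsets.UTF_8).trim();
        return Long.parseLong(text);
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java AdlerCheckSum <data-file> [digest-file]");
            return;
        }
        long computed = checksumOf(Paths.get(args[0]));
        System.out.println(computed);
        if (args.length > 1) {
            // Optional second argument: compare against the recorded digest.
            long recorded = readDigest(Paths.get(args[1]));
            System.out.println(recorded == computed ? "match" : "MISMATCH (recorded " + recorded + ")");
        }
    }
}
```

Run with only the Data.db path, this prints a single number that can be compared by eye with the digest file. Passing the digest file as a second argument makes the program report a match or mismatch itself.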
[jira] [Comment Edited] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14328280#comment-14328280 ]

Jeff Jirsa edited comment on CASSANDRA-5791 at 2/21/15 4:53 AM:

Thanks for the feedback.

On whether or not a missing digest indicates corruption: in the case of a missing digest, does it make more sense to imply --extended and verify atoms? Doing that would at least verify the inline checksums for compressed sstables.

Most of the remaining nits are 100% valid, and due to me basing this on the scrub path without eliminating all of the obsolete code. Cleaning up to address. Only nit that seems inconsistent: sstableverify.bat's ability to specify CASSANDRA_MAIN is consistent with other similar tools (sstablescrub, sstableupgrade, sstableloader, sstablekeys)

Updated for nits: https://github.com/jeffjirsa/cassandra/compare/cassandra-5791 and/or https://github.com/jeffjirsa/cassandra/compare/cassandra-5791.diff

was (Author: jjirsa):
Thanks for the feedback.

On whether or not a missing digest indicates corruption: in the case of a missing digest, does it make more sense to imply --extended and verify atoms? Doing that would at least verify the inline checksums for compressed sstables.

Most of the remaining nits are 100% valid, and due to me basing this on the scrub path without eliminating all of the obsolete code. Cleaning up to address. Only nit that seems inconsistent: sstableverify.bat's ability to specify CASSANDRA_MAIN is consistent with other similar tools (sstablescrub, sstableupgrade, sstableloader, sstablekeys)

A nodetool command to validate all sstables in a node

Key: CASSANDRA-5791
URL: https://issues.apache.org/jira/browse/CASSANDRA-5791
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: sankalp kohli
Assignee: Jeff Jirsa
Priority: Minor
Attachments: cassandra-5791.patch-2

Currently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that rewrites all the sstables. This command should check the hash of all sstables and return whether all data is readable or not. This should NOT care about consistency. The compressed sstables do not have a hash, so not sure how it will work there.
[jira] [Comment Edited] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310025#comment-14310025 ]

Jeff Jirsa edited comment on CASSANDRA-5791 at 2/10/15 7:25 PM:

Duplicating my comment from 8703 here since it's a dupe and prone to closure:

-I've got a version at https://github.com/jeffjirsa/cassandra/commits/cassandra-8703 that follows the scrub read path and implements nodetool verify / sstableverify. This works, for both compressed and uncompressed, but requires walking the entire sstable and verifies each on-disk atom. This works, it just isn't very fast (though it is thorough). The faster method will be checking against the Digest.sha1 file (which actually contains an adler32 hash), and skipping the full iteration. I'll rebase and work that in, using the 'walk all atoms' approach above as an optional extended verify (-e) or similar, unless someone objects. Also going to rename the DIGEST sstable component to Digest.adler32 since it's definitely not sha1 anymore.-

(New patch attached)

was (Author: jjirsa):
Duplicating my comment from 8703 here since it's a dupe and prone to closure:

I've got a version at https://github.com/jeffjirsa/cassandra/commits/cassandra-8703 that follows the scrub read path and implements nodetool verify / sstableverify. This works, for both compressed and uncompressed, but requires walking the entire sstable and verifies each on-disk atom. This works, it just isn't very fast (though it is thorough). The faster method will be checking against the Digest.sha1 file (which actually contains an adler32 hash), and skipping the full iteration. I'll rebase and work that in, using the 'walk all atoms' approach above as an optional extended verify (-e) or similar, unless someone objects. Also going to rename the DIGEST sstable component to Digest.adler32 since it's definitely not sha1 anymore.

A nodetool command to validate all sstables in a node

Key: CASSANDRA-5791
URL: https://issues.apache.org/jira/browse/CASSANDRA-5791
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: sankalp kohli
Assignee: Jeff Jirsa
Priority: Minor
Attachments: cassandra-5791.patch.txt

Currently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that rewrites all the sstables. This command should check the hash of all sstables and return whether all data is readable or not. This should NOT care about consistency. The compressed sstables do not have a hash, so not sure how it will work there.
[jira] [Comment Edited] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310025#comment-14310025 ]

Jeff Jirsa edited comment on CASSANDRA-5791 at 2/10/15 7:25 PM:

Duplicating my comment from 8703 here since it's a dupe and prone to closure:

-I've got a version at https://github.com/jeffjirsa/cassandra/commits/cassandra-8703 that follows the scrub read path and implements nodetool verify / sstableverify. This works, for both compressed and uncompressed, but requires walking the entire sstable and verifies each on-disk atom. This works, it just isn't very fast (though it is thorough). The faster method will be checking against the Digest.sha1 file (which actually contains an adler32 hash), and skipping the full iteration. I'll rebase and work that in, using the 'walk all atoms' approach above as an optional extended verify or similar, unless someone objects. Also going to rename the DIGEST sstable component to Digest.adler32 since it's definitely not sha1 anymore.-

(New patch attached)

was (Author: jjirsa):
Duplicating my comment from 8703 here since it's a dupe and prone to closure:

-I've got a version at https://github.com/jeffjirsa/cassandra/commits/cassandra-8703 that follows the scrub read path and implements nodetool verify / sstableverify. This works, for both compressed and uncompressed, but requires walking the entire sstable and verifies each on-disk atom. This works, it just isn't very fast (though it is thorough). The faster method will be checking against the Digest.sha1 file (which actually contains an adler32 hash), and skipping the full iteration. I'll rebase and work that in, using the 'walk all atoms' approach above as an optional extended verify (-e) or similar, unless someone objects. Also going to rename the DIGEST sstable component to Digest.adler32 since it's definitely not sha1 anymore.-

(New patch attached)

A nodetool command to validate all sstables in a node

Key: CASSANDRA-5791
URL: https://issues.apache.org/jira/browse/CASSANDRA-5791
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: sankalp kohli
Assignee: Jeff Jirsa
Priority: Minor
Attachments: cassandra-5791.patch.txt

Currently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that rewrites all the sstables. This command should check the hash of all sstables and return whether all data is readable or not. This should NOT care about consistency. The compressed sstables do not have a hash, so not sure how it will work there.
[jira] [Comment Edited] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310025#comment-14310025 ]

Jeff Jirsa edited comment on CASSANDRA-5791 at 2/10/15 7:26 PM:

Duplicating my comment from 8703 here since it's a dupe and prone to closure:

-I've got a version at ... that follows the scrub read path and implements nodetool verify / sstableverify. This works, for both compressed and uncompressed, but requires walking the entire sstable and verifies each on-disk atom. This works, it just isn't very fast (though it is thorough). The faster method will be checking against the Digest.sha1 file (which actually contains an adler32 hash), and skipping the full iteration. I'll rebase and work that in, using the 'walk all atoms' approach above as an optional extended verify or similar, unless someone objects. Also going to rename the DIGEST sstable component to Digest.adler32 since it's definitely not sha1 anymore.-

(New patch attached)

was (Author: jjirsa):
Duplicating my comment from 8703 here since it's a dupe and prone to closure:

-I've got a version at https://github.com/jeffjirsa/cassandra/commits/cassandra-8703 that follows the scrub read path and implements nodetool verify / sstableverify. This works, for both compressed and uncompressed, but requires walking the entire sstable and verifies each on-disk atom. This works, it just isn't very fast (though it is thorough). The faster method will be checking against the Digest.sha1 file (which actually contains an adler32 hash), and skipping the full iteration. I'll rebase and work that in, using the 'walk all atoms' approach above as an optional extended verify or similar, unless someone objects. Also going to rename the DIGEST sstable component to Digest.adler32 since it's definitely not sha1 anymore.-

(New patch attached)

A nodetool command to validate all sstables in a node

Key: CASSANDRA-5791
URL: https://issues.apache.org/jira/browse/CASSANDRA-5791
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: sankalp kohli
Assignee: Jeff Jirsa
Priority: Minor
Attachments: cassandra-5791.patch.txt

Currently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds. But we cannot repair the system keyspace. Also we can run upgrade sstables but that rewrites all the sstables. This command should check the hash of all sstables and return whether all data is readable or not. This should NOT care about consistency. The compressed sstables do not have a hash, so not sure how it will work there.