[jira] [Comment Edited] (CASSANDRA-5791) A nodetool command to validate all sstables in a node

2015-03-25 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14370665#comment-14370665
 ] 

Jeff Jirsa edited comment on CASSANDRA-5791 at 3/26/15 3:25 AM:


So the same bug corrected in CASSANDRA-8778 was re-introduced by CASSANDRA-8709.

I've done the following:

1) Rebased to trunk as of 20150325
2) Removed o.a.c.io.DataIntegrityMetadata#append
3) Corrected o.a.c.io.DataIntegrityMetadata#appendDirect
4) Brought over [~benedict]'s PureJavaCRC32 fix from above (which was 
correct - 7bef6f93aea3a6897b53e909688f5948c018ccdf) 

Commit: 
https://github.com/jeffjirsa/cassandra/commit/daaa88878c023fbcd94d4aa7d02696a675a118dd
Diff: 
https://github.com/apache/cassandra/compare/trunk...jeffjirsa:cassandra-5791.diff

Passing: 
{noformat}
[junit] -  ---
[junit] Testsuite: org.apache.cassandra.db.VerifyTest
[junit] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.803 sec
[junit] 
{noformat}

[~jbellis] and [~benedict] - if you want unit tests for DataIntegrityMetadata, 
Jira it, assign me, and I'll write them. I'd have done it tonight but I can't 
convince myself that they're not redundant with the (included) verifier unit 
tests which will test the checksums anyway. 
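
For reference, a unit test of the kind discussed above could assert that a checksum fed chunk by chunk matches the checksum of the same bytes computed in one pass. The following is only a minimal sketch of that property using java.util.zip.CRC32; it does not use Cassandra's PureJavaCRC32 or DataIntegrityMetadata, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical illustration; not code from the patch.
final class IncrementalChecksumSketch {

    /** Checksum of the whole array computed in a single update call. */
    static long oneShot(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    /** Checksum of the same bytes, fed to the checksum chunk by chunk. */
    static long chunked(byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        CRC32 crc = new CRC32();
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            ByteBuffer chunk = ByteBuffer.wrap(data, off, len);
            // update(ByteBuffer) consumes the buffer it is given, so pass a
            // duplicate to leave the position of `chunk` untouched.
            crc.update(chunk.duplicate());
        }
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] data = "some sstable bytes".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        System.out.println(oneShot(data) == chunked(data, 5));
    }
}
```

Both methods must agree for any chunk size, because the checksum depends only on the byte sequence and not on how it is split.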

(I will keep rebasing every few days as required. The latest rebase is 20150325; 
pull the latest from GitHub rather than using the attached files.)


was (Author: jjirsa):
So the same bug corrected in CASSANDRA-8778 was re-introduced by CASSANDRA-8709.

I've done the following:

1) Rebased to trunk as of 20150319
2) Removed o.a.c.io.DataIntegrityMetadata#append
3) Corrected o.a.c.io.DataIntegrityMetadata#appendDirect
4) Brought over [~benedict]'s PureJavaCRC32 fix from above (which was 
correct - 7bef6f93aea3a6897b53e909688f5948c018ccdf) 

Commit: 
https://github.com/jeffjirsa/cassandra/commit/79642ea4f56a33f249e807abdd562f89d20f6c36
Diff at 
https://github.com/apache/cassandra/compare/trunk...jeffjirsa:cassandra-5791.diff
(I'll also attach it here as cassandra-5791-20150319.diff)

Passing: 
{noformat}
[junit] -  ---
[junit] Testsuite: org.apache.cassandra.db.VerifyTest
[junit] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.803 sec
[junit] 
{noformat}

[~jbellis] and [~benedict] - if you want unit tests for DataIntegrityMetadata, 
Jira it, assign me, and I'll write them. I'd have done it tonight but I can't 
convince myself that they're not redundant with the (included) verifier unit 
tests which will test the checksums anyway. 


 A nodetool command to validate all sstables in a node
 -

 Key: CASSANDRA-5791
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: sankalp kohli
Assignee: Jeff Jirsa
Priority: Minor
 Fix For: 3.0

 Attachments: cassandra-5791-20150319.diff, 
 cassandra-5791-patch-3.diff, cassandra-5791.patch-2


 Currently there is no nodetool command to validate all sstables on disk. The 
 only way to do this is to run a repair and see if it succeeds. But we cannot 
 repair the system keyspace. 
 Also, we can run upgradesstables, but that rewrites all the sstables. 
 This command should check the hash of all sstables and return whether all 
 data is readable or not. This should NOT care about consistency. 
 Compressed sstables do not have a hash, so it is not clear how this will work there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-5791) A nodetool command to validate all sstables in a node

2015-03-19 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368584#comment-14368584
 ] 

Michael Shuler edited comment on CASSANDRA-5791 at 3/19/15 6:30 AM:


trunk HEAD fails 4 of these new VerifyTest unit tests - can we fix this up?
{noformat}
[junit] Testsuite: org.apache.cassandra.db.VerifyTest
[junit] Tests run: 10, Failures: 4, Errors: 0, Skipped: 0, Time elapsed: 3.161 sec
[junit] 
[junit] - Standard Output ---
[junit] WARN  06:21:48 No host ID found, created 3921d9b6-df80-4a62-95cb-7b4ab506e29b (Note: This should happen exactly once per node).
[junit] WARN  06:21:48 No host ID found, created 3921d9b6-df80-4a62-95cb-7b4ab506e29b (Note: This should happen exactly once per node).
[junit] -  ---
[junit] Testcase: testVerifyCorrect(org.apache.cassandra.db.VerifyTest): FAILED
[junit] Unexpected CorruptSSTableException
[junit] junit.framework.AssertionFailedError: Unexpected CorruptSSTableException
[junit] at org.apache.cassandra.db.VerifyTest.testVerifyCorrect(VerifyTest.java:123)
[junit] 
[junit] 
[junit] Testcase: testVerifyCounterCorrect(org.apache.cassandra.db.VerifyTest): FAILED
[junit] Unexpected CorruptSSTableException
[junit] junit.framework.AssertionFailedError: Unexpected CorruptSSTableException
[junit] at org.apache.cassandra.db.VerifyTest.testVerifyCounterCorrect(VerifyTest.java:145)
[junit] 
[junit] 
[junit] Testcase: testExtendedVerifyCorrect(org.apache.cassandra.db.VerifyTest): FAILED
[junit] Unexpected CorruptSSTableException
[junit] junit.framework.AssertionFailedError: Unexpected CorruptSSTableException
[junit] at org.apache.cassandra.db.VerifyTest.testExtendedVerifyCorrect(VerifyTest.java:167)
[junit] 
[junit] 
[junit] Testcase: testExtendedVerifyCounterCorrect(org.apache.cassandra.db.VerifyTest): FAILED
[junit] Unexpected CorruptSSTableException
[junit] junit.framework.AssertionFailedError: Unexpected CorruptSSTableException
[junit] at org.apache.cassandra.db.VerifyTest.testExtendedVerifyCounterCorrect(VerifyTest.java:189)
[junit] 
[junit] 
[junit] Test org.apache.cassandra.db.VerifyTest FAILED
{noformat}


was (Author: mshuler):
trunk HEAD fails all these new VerifyTest unit tests - can we fix this up?
{noformat}
[junit] Testsuite: org.apache.cassandra.db.VerifyTest
[junit] Tests run: 10, Failures: 4, Errors: 0, Skipped: 0, Time elapsed: 3.161 sec
[junit] 
[junit] - Standard Output ---
[junit] WARN  06:21:48 No host ID found, created 3921d9b6-df80-4a62-95cb-7b4ab506e29b (Note: This should happen exactly once per node).
[junit] WARN  06:21:48 No host ID found, created 3921d9b6-df80-4a62-95cb-7b4ab506e29b (Note: This should happen exactly once per node).
[junit] -  ---
[junit] Testcase: testVerifyCorrect(org.apache.cassandra.db.VerifyTest): FAILED
[junit] Unexpected CorruptSSTableException
[junit] junit.framework.AssertionFailedError: Unexpected CorruptSSTableException
[junit] at org.apache.cassandra.db.VerifyTest.testVerifyCorrect(VerifyTest.java:123)
[junit] 
[junit] 
[junit] Testcase: testVerifyCounterCorrect(org.apache.cassandra.db.VerifyTest): FAILED
[junit] Unexpected CorruptSSTableException
[junit] junit.framework.AssertionFailedError: Unexpected CorruptSSTableException
[junit] at org.apache.cassandra.db.VerifyTest.testVerifyCounterCorrect(VerifyTest.java:145)
[junit] 
[junit] 
[junit] Testcase: testExtendedVerifyCorrect(org.apache.cassandra.db.VerifyTest): FAILED
[junit] Unexpected CorruptSSTableException
[junit] junit.framework.AssertionFailedError: Unexpected CorruptSSTableException
[junit] at org.apache.cassandra.db.VerifyTest.testExtendedVerifyCorrect(VerifyTest.java:167)
[junit] 
[junit] 
[junit] Testcase: testExtendedVerifyCounterCorrect(org.apache.cassandra.db.VerifyTest): FAILED
[junit] Unexpected CorruptSSTableException
[junit] junit.framework.AssertionFailedError: Unexpected CorruptSSTableException
[junit] at org.apache.cassandra.db.VerifyTest.testExtendedVerifyCounterCorrect(VerifyTest.java:189)
[junit] 
[junit] 
[junit] Test org.apache.cassandra.db.VerifyTest FAILED
{noformat}

 A nodetool command to validate all sstables in a node
 -

 Key: CASSANDRA-5791
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: sankalp kohli

[jira] [Comment Edited] (CASSANDRA-5791) A nodetool command to validate all sstables in a node

2015-03-19 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14370665#comment-14370665
 ] 

Jeff Jirsa edited comment on CASSANDRA-5791 at 3/20/15 4:54 AM:


So the same bug corrected in CASSANDRA-8778 was re-introduced by CASSANDRA-8709.

I've done the following:

1) Rebased to trunk as of 20150319
2) Removed o.a.c.io.DataIntegrityMetadata#append
3) Corrected o.a.c.io.DataIntegrityMetadata#appendDirect
4) Brought over [~benedict]'s PureJavaCRC32 fix from above (which was 
correct - 7bef6f93aea3a6897b53e909688f5948c018ccdf) 

Commit: 
https://github.com/jeffjirsa/cassandra/commit/79642ea4f56a33f249e807abdd562f89d20f6c36
Diff at 
https://github.com/apache/cassandra/compare/trunk...jeffjirsa:cassandra-5791.diff
(I'll also attach it here as cassandra-5791-20150319.diff)

Passing: 
{noformat}
[junit] -  ---
[junit] Testsuite: org.apache.cassandra.db.VerifyTest
[junit] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.803 sec
[junit] 
{noformat}

[~jbellis] and [~benedict] - if you want unit tests for DataIntegrityMetadata, 
Jira it, assign me, and I'll write them. I'd have done it tonight but I can't 
convince myself that they're not redundant with the (included) verifier unit 
tests which will test the checksums anyway. 



was (Author: jjirsa):
So the same bug corrected in CASSANDRA-8778 was re-introduced by CASSANDRA-8709 
, as it was developed in parallel and was likely merged/reviewed without the 
benefit of knowing about #8778.

I've done the following:

1) Rebased to trunk as of 20150319
2) Removed o.a.c.io.DataIntegrityMetadata#append
3) Corrected o.a.c.io.DataIntegrityMetadata#appendDirect
4) Brought over [~benedict]'s PureJavaCRC32 fix from above (which was 
correct - 7bef6f93aea3a6897b53e909688f5948c018ccdf) 

Commit: 
https://github.com/jeffjirsa/cassandra/commit/79642ea4f56a33f249e807abdd562f89d20f6c36
Diff at 
https://github.com/apache/cassandra/compare/trunk...jeffjirsa:cassandra-5791.diff
(I'll also attach it here as cassandra-5791-20150319.diff)

Passing: 
{noformat}
[junit] -  ---
[junit] Testsuite: org.apache.cassandra.db.VerifyTest
[junit] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.803 sec
[junit] 
{noformat}

[~jbellis] and [~benedict] - if you want unit tests for DataIntegrityMetadata, 
Jira it, assign me, and I'll write them. I'd have done it tonight but I can't 
convince myself that they're not redundant with the (included) verifier unit 
tests which will test the checksums anyway. 


 A nodetool command to validate all sstables in a node
 -

 Key: CASSANDRA-5791
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: sankalp kohli
Assignee: Jeff Jirsa
Priority: Minor
 Fix For: 3.0

 Attachments: cassandra-5791-20150319.diff, 
 cassandra-5791-patch-3.diff, cassandra-5791.patch-2




[jira] [Comment Edited] (CASSANDRA-5791) A nodetool command to validate all sstables in a node

2015-03-19 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14370077#comment-14370077
 ] 

Jeff Jirsa edited comment on CASSANDRA-5791 at 3/19/15 8:43 PM:


The tests are failing because the checksums for compressed sstables are 
incorrect again. 

{noformat}
# cat /Users/jeff/.ccm/snapshot/node1/data/test2/metrics-aded07e0ce7711e4897c85b755fc16c4/la-1-big-Digest.adler32
822598308
# java AdlerCheckSum /Users/jeff/.ccm/snapshot/node1/data/test2/metrics-aded07e0ce7711e4897c85b755fc16c4/la-1-big-Data.db
864477438
{noformat}

The checksums should have been corrected by CASSANDRA-8778, so I'll track down 
where the regression was introduced tonight, after business hours PST.
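
The source of the AdlerCheckSum helper used above is not shown in this thread. As a stand-in, here is a minimal sketch of the same comparison: it streams a data file through java.util.zip.Adler32 and compares the result with the decimal value stored in the digest file. The class name DigestCheck and its methods are hypothetical, and the decimal-text digest format is assumed from the cat output above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.Adler32;
import java.util.zip.CheckedInputStream;

// Hypothetical stand-in for the AdlerCheckSum helper; not code from the patch.
final class DigestCheck {

    /** Streams the file through Adler32 and returns the checksum of its on-disk bytes. */
    static long adler32Of(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file);
             CheckedInputStream checked = new CheckedInputStream(in, new Adler32())) {
            byte[] buf = new byte[64 * 1024];
            while (checked.read(buf) != -1) {
                // Reading is enough: CheckedInputStream updates the checksum as bytes pass through.
            }
            return checked.getChecksum().getValue();
        }
    }

    /** Parses the digest file, assumed to hold the checksum as decimal text. */
    static long storedDigest(Path digestFile) throws IOException {
        String text = new String(Files.readAllBytes(digestFile), StandardCharsets.UTF_8).trim();
        return Long.parseLong(text);
    }

    /** True when the computed checksum of the data file equals the stored digest. */
    static boolean matches(Path dataFile, Path digestFile) throws IOException {
        return adler32Of(dataFile) == storedDigest(digestFile);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java DigestCheck <Data.db> <Digest.adler32>");
            System.exit(2);
        }
        Path data = Paths.get(args[0]);
        Path digest = Paths.get(args[1]);
        System.out.println("stored:   " + storedDigest(digest));
        System.out.println("computed: " + adler32Of(data));
        System.out.println(matches(data, digest) ? "OK" : "MISMATCH");
    }
}
```

Run against the two files above, a check like this would be expected to report a mismatch, since the stored and computed values differ.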




was (Author: jjirsa):
Cause of tests failing is that checksums are incorrect for compressed sstables 
again. 

{noformat}
# cat /Users/jeff/.ccm/snapshot/node1/data/test2/metrics-aded07e0ce7711e4897c85b755fc16c4/la-1-big-Digest.adler32
822598308
# java AdlerCheckSum /Users/jeff/.ccm/snapshot/node1/data/test2/metrics-aded07e0ce7711e4897c85b755fc16c4/la-1-big-Data.db
864477438
{noformat}

The checksums should have been corrected by CASSANDRA-8778 so I'll figure out 
where the regression happened tonight after business hours PST.



 A nodetool command to validate all sstables in a node
 -

 Key: CASSANDRA-5791
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: sankalp kohli
Assignee: Jeff Jirsa
Priority: Minor
 Fix For: 3.0

 Attachments: cassandra-5791-patch-3.diff, cassandra-5791.patch-2




[jira] [Comment Edited] (CASSANDRA-5791) A nodetool command to validate all sstables in a node

2015-02-20 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14328280#comment-14328280
 ] 

Jeff Jirsa edited comment on CASSANDRA-5791 at 2/21/15 4:53 AM:


Thanks for the feedback.

On whether or not a missing digest indicates corruption: in the case of a 
missing digest, does it make more sense to imply --extended and verify atoms? 
Doing that would at least verify the inline checksums for compressed sstables. 
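
To make the question concrete, the proposed behaviour can be sketched as a small decision function: run the extended check when it is requested explicitly, and also fall back to it when no digest file exists. This is only an illustration of the proposal; the class, enum, and method names are hypothetical and are not taken from the patch.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration of the proposed fallback; not code from the patch.
final class VerifyModeChooser {

    enum Mode {
        /** Compare the data file against the stored digest only. */
        DIGEST_ONLY,
        /** Walk the sstable and verify every atom. */
        EXTENDED
    }

    /** A missing digest implies the extended check, as does an explicit request. */
    static Mode choose(boolean extendedRequested, boolean digestPresent) {
        if (extendedRequested || !digestPresent) {
            return Mode.EXTENDED;
        }
        return Mode.DIGEST_ONLY;
    }

    /** Convenience overload that checks for the digest file on disk. */
    static Mode choose(boolean extendedRequested, Path digestFile) {
        return choose(extendedRequested, Files.isRegularFile(digestFile));
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java VerifyModeChooser <digest-file>");
            System.exit(2);
        }
        System.out.println(choose(false, Paths.get(args[0])));
    }
}
```

The sketch leaves open whether an explicit extended run should also check the digest when one is present.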

Most of the remaining nits are 100% valid, and stem from my basing this on the 
scrub path without eliminating all of the obsolete code. I'm cleaning that up. 
The only nit I'd push back on: sstableverify.bat's ability to specify 
CASSANDRA_MAIN is consistent with other similar tools (sstablescrub, 
sstableupgrade, sstableloader, sstablekeys).

Updated for nits: 

https://github.com/jeffjirsa/cassandra/compare/cassandra-5791 and/or 
https://github.com/jeffjirsa/cassandra/compare/cassandra-5791.diff




was (Author: jjirsa):
Thanks for the feedback.

On whether or not a missing digest indicates corruption: in the case of a 
missing digest, does it make more sense to imply --extended and verify atoms? 
Doing that would at least verify the inline checksums for compressed sstables. 

Most of the remaining nits are 100% valid, and stem from my basing this on the 
scrub path without eliminating all of the obsolete code. I'm cleaning that up. 
The only nit I'd push back on: sstableverify.bat's ability to specify 
CASSANDRA_MAIN is consistent with other similar tools (sstablescrub, 
sstableupgrade, sstableloader, sstablekeys).



 A nodetool command to validate all sstables in a node
 -

 Key: CASSANDRA-5791
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: sankalp kohli
Assignee: Jeff Jirsa
Priority: Minor
 Attachments: cassandra-5791.patch-2




[jira] [Comment Edited] (CASSANDRA-5791) A nodetool command to validate all sstables in a node

2015-02-10 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310025#comment-14310025
 ] 

Jeff Jirsa edited comment on CASSANDRA-5791 at 2/10/15 7:25 PM:


Duplicating my comment from 8703 here since it's a dupe and prone to closure:

-I've got a version at 
https://github.com/jeffjirsa/cassandra/commits/cassandra-8703 that follows the 
scrub read path and implements nodetool verify / sstableverify. It works for 
both compressed and uncompressed sstables, but it requires walking the entire 
sstable and verifying each on-disk atom, so it is thorough but not very fast.
The faster method will be checking against the Digest.sha1 file (which actually 
contains an adler32 hash) and skipping the full iteration. I'll rebase and 
work that in, keeping the 'walk all atoms' approach above as an optional extended 
verify (-e) or similar, unless someone objects. I'm also going to rename the DIGEST 
sstable component to Digest.adler32, since it's definitely not sha1 anymore.- 
(New patch attached)


was (Author: jjirsa):
Duplicating my comment from 8703 here since its a dupe and prone to closure :

I've got a version at 
https://github.com/jeffjirsa/cassandra/commits/cassandra-8703 that follows the 
scrub read path and implements nodetool verify / sstableverify. This works, for 
both compressed and uncompressed, but requires walking the entire sstable and 
verifies each on disk atom. This works, it just isn't very fast (though it is 
thorough).
The faster method will be checking against the Digest.sha1 file (which actually 
contains an adler32 hash), and skipping the full iteration. I'll rebase and 
work that in, using the 'walk all atoms' approach above as an optional extended 
verify (-e) or similar, unless someone objects. Also going to rename the DIGEST 
sstable component to Digest.adler32 since it's definitely not sha1 anymore.

 A nodetool command to validate all sstables in a node
 -

 Key: CASSANDRA-5791
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: sankalp kohli
Assignee: Jeff Jirsa
Priority: Minor
 Attachments: cassandra-5791.patch.txt




[jira] [Comment Edited] (CASSANDRA-5791) A nodetool command to validate all sstables in a node

2015-02-10 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310025#comment-14310025
 ] 

Jeff Jirsa edited comment on CASSANDRA-5791 at 2/10/15 7:25 PM:


Duplicating my comment from 8703 here since it's a dupe and prone to closure:

-I've got a version at 
https://github.com/jeffjirsa/cassandra/commits/cassandra-8703 that follows the 
scrub read path and implements nodetool verify / sstableverify. It works for 
both compressed and uncompressed sstables, but it requires walking the entire 
sstable and verifying each on-disk atom, so it is thorough but not very fast.
The faster method will be checking against the Digest.sha1 file (which actually 
contains an adler32 hash) and skipping the full iteration. I'll rebase and 
work that in, keeping the 'walk all atoms' approach above as an optional extended 
verify or similar, unless someone objects. I'm also going to rename the DIGEST 
sstable component to Digest.adler32, since it's definitely not sha1 anymore.- 
(New patch attached)


was (Author: jjirsa):
Duplicating my comment from 8703 here since its a dupe and prone to closure :

-I've got a version at 
https://github.com/jeffjirsa/cassandra/commits/cassandra-8703 that follows the 
scrub read path and implements nodetool verify / sstableverify. This works, for 
both compressed and uncompressed, but requires walking the entire sstable and 
verifies each on disk atom. This works, it just isn't very fast (though it is 
thorough).
The faster method will be checking against the Digest.sha1 file (which actually 
contains an adler32 hash), and skipping the full iteration. I'll rebase and 
work that in, using the 'walk all atoms' approach above as an optional extended 
verify (-e) or similar, unless someone objects. Also going to rename the DIGEST 
sstable component to Digest.adler32 since it's definitely not sha1 anymore.- 
(New patch attached)

 A nodetool command to validate all sstables in a node
 -

 Key: CASSANDRA-5791
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: sankalp kohli
Assignee: Jeff Jirsa
Priority: Minor
 Attachments: cassandra-5791.patch.txt




[jira] [Comment Edited] (CASSANDRA-5791) A nodetool command to validate all sstables in a node

2015-02-10 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310025#comment-14310025
 ] 

Jeff Jirsa edited comment on CASSANDRA-5791 at 2/10/15 7:26 PM:


Duplicating my comment from 8703 here since it's a dupe and prone to closure:

-I've got a version at ... that follows the scrub read path and implements 
nodetool verify / sstableverify. It works for both compressed and 
uncompressed sstables, but it requires walking the entire sstable and verifying 
each on-disk atom, so it is thorough but not very fast.
The faster method will be checking against the Digest.sha1 file (which actually 
contains an adler32 hash) and skipping the full iteration. I'll rebase and 
work that in, keeping the 'walk all atoms' approach above as an optional extended 
verify or similar, unless someone objects. I'm also going to rename the DIGEST 
sstable component to Digest.adler32, since it's definitely not sha1 anymore.- 
(New patch attached)


was (Author: jjirsa):
Duplicating my comment from 8703 here since its a dupe and prone to closure :

-I've got a version at 
https://github.com/jeffjirsa/cassandra/commits/cassandra-8703 that follows the 
scrub read path and implements nodetool verify / sstableverify. This works, for 
both compressed and uncompressed, but requires walking the entire sstable and 
verifies each on disk atom. This works, it just isn't very fast (though it is 
thorough).
The faster method will be checking against the Digest.sha1 file (which actually 
contains an adler32 hash), and skipping the full iteration. I'll rebase and 
work that in, using the 'walk all atoms' approach above as an optional extended 
verify or similar, unless someone objects. Also going to rename the DIGEST 
sstable component to Digest.adler32 since it's definitely not sha1 anymore.- 
(New patch attached)

 A nodetool command to validate all sstables in a node
 -

 Key: CASSANDRA-5791
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: sankalp kohli
Assignee: Jeff Jirsa
Priority: Minor
 Attachments: cassandra-5791.patch.txt

