[ https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081569#comment-13081569 ]

Pavel Yaskevich edited comment on CASSANDRA-1717 at 8/9/11 11:25 AM:
---------------------------------------------------------------------

bq. CSW.flushData() forgot to reset the checksum (this is caught by the unit 
tests btw).

  Not a problem; it was due to Sylvain's bad merge.
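
For illustration, a minimal standalone sketch (not the actual CSW/flushData code) of why a forgotten reset() breaks the next chunk's checksum:

{code:java}
import java.util.zip.CRC32;

public class ChecksumResetDemo {
    public static void main(String[] args) {
        byte[] chunkA = "first chunk".getBytes();
        byte[] chunkB = "second chunk".getBytes();

        CRC32 crc = new CRC32();
        crc.update(chunkA);
        long crcA = crc.getValue();

        // Without reset(), the next value covers chunkA plus chunkB,
        // not chunkB alone; that is the shape of the flushData() bug.
        crc.update(chunkB);
        long accumulated = crc.getValue();

        crc.reset();                  // what flushData() has to do per chunk
        crc.update(chunkB);
        long crcB = crc.getValue();

        System.out.printf("A=%d, accumulated=%d, B=%d%n", crcA, accumulated, crcB);
    }
}
{code}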

bq. We should convert the CRC32 to an int (and only write that), as it is an int 
internally (getValue() returns a long only because CRC32 implements the 
Checksum interface, which requires it).

  Let's leave that to the CRC-optimization ticket, which will allow us to 
change this system-wide.
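
For reference, a small demo of why the narrowing would be lossless: CRC32 is a 32-bit checksum, so the upper 32 bits of getValue()'s long are always zero (class and variable names here are illustrative only):

{code:java}
import java.util.zip.CRC32;

public class CrcWidthDemo {
    public static void main(String[] args) {
        CRC32 crc = new CRC32();
        crc.update("some chunk".getBytes());

        long asLong = crc.getValue(); // Checksum interface forces a long
        int asInt = (int) asLong;     // lossless: only the low 32 bits are used

        // Widening back with & 0xFFFFFFFFL recovers the original value,
        // which is why writing 4 bytes per chunk instead of 8 would be safe.
        System.out.println(asLong == (asInt & 0xFFFFFFFFL)); // true
    }
}
{code}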

bq. Here we checksum the compressed data. The other approach would be to 
checksum the uncompressed data. The advantage of checksumming compressed data 
is speed (less data to checksum), but checksumming the uncompressed data would 
be a little safer. In particular, it would catch mistakes in the decompression 
(and we wouldn't have to trust the compression algorithm; not that I don't 
trust Snappy, but...). This is clearly a trade-off that we have to make, but I 
admit my personal preference leans towards safety (in particular, I know that 
checksumming the uncompressed data gives a bit more safety; I don't know 
quantitatively what our exact gain is with checksumming compressed data). On 
the other hand, checksumming the uncompressed data would likely mean that a 
good part of the bitrot would result in a decompression error rather than a 
checksum error, which is maybe less convenient from the implementation point 
of view. So I don't know; I guess I'm thinking aloud to hear others' opinions 
more than anything else.

  The checksum is now computed over the original (uncompressed) data.
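
A minimal sketch of the resulting write-path shape, using java.util.zip.Deflater as a stand-in for Snappy; method and variable names are hypothetical and the actual CompressedSequentialWriter code differs:

{code:java}
import java.util.zip.CRC32;
import java.util.zip.Deflater;

public class ChecksumBeforeCompress {
    // Checksum the original (uncompressed) bytes, then compress. On read,
    // decompress first and verify the CRC against the decompressed bytes,
    // so bitrot and decompression bugs are both caught.
    static long checksumAndCompress(byte[] original, byte[] compressedOut) {
        CRC32 crc = new CRC32();
        crc.update(original);               // checksum BEFORE compression

        Deflater deflater = new Deflater(); // stand-in for Snappy
        deflater.setInput(original);
        deflater.finish();
        deflater.deflate(compressedOut);
        deflater.end();

        return crc.getValue();              // stored alongside the chunk
    }

    public static void main(String[] args) {
        byte[] original = "uncompressed chunk data".getBytes();
        byte[] compressed = new byte[original.length + 64];
        System.out.println("stored CRC = " + checksumAndCompress(original, compressed));
    }
}
{code}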
 
bq. Let's add some unit tests. At least it's relatively easy to write a few 
blocks, flip one bit in the resulting file, and check that this is caught at 
read time (or better, do that multiple times, changing a different bit each 
time).

  A test was added to CompressedRandomAccessReaderTest.
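
A rough sketch of the bit-flip idea against a plain file; the real test lives in CompressedRandomAccessReaderTest and goes through the compressed reader, which should surface a corruption error:

{code:java}
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class BitFlipSketch {
    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("corruption", ".db");
        Files.write(file, "block of column data".getBytes());

        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            long offset = 5;            // any byte inside the written block
            raf.seek(offset);
            int original = raf.read();
            raf.seek(offset);
            raf.write(original ^ 0x01); // flip a single bit
        }
        // Reading this back through a checksummed reader should now fail
        // with a checksum mismatch instead of returning mangled bytes;
        // repeating with a different offset/bit each time broadens coverage.
        System.out.println("corrupted one bit in " + file);
    }
}
{code}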

bq. As Todd noted, HADOOP-6148 contains a bunch of discussion on the 
efficiency of Java's CRC32. In particular, it seems they have been able to 
nearly double the speed of the CRC32, with a solution that seems fairly simple 
to me. It would be OK to use the native Java CRC32 and leave the improvement 
to another ticket, but quite frankly, if it is that simple, and since the 
Hadoop guys have done all the hard work for us, I say we start with the 
efficient version directly.

  As decided previously, this will be handled in a separate ticket.
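
For reference, a sketch of why deferring is cheap: coding against the java.util.zip.Checksum interface means a faster implementation such as Hadoop's PureJavaCrc32 (which also implements Checksum) could later be dropped in without touching callers. The helper below is hypothetical:

{code:java}
import java.util.zip.CRC32;
import java.util.zip.Checksum;

public class PluggableChecksum {
    // Callers depend only on the Checksum interface, not on CRC32 itself,
    // so the separate optimization ticket can swap the implementation.
    static long checksum(Checksum impl, byte[] data) {
        impl.reset();
        impl.update(data, 0, data.length);
        return impl.getValue();
    }

    public static void main(String[] args) {
        byte[] chunk = "chunk bytes".getBytes();
        System.out.println(checksum(new CRC32(), chunk));
        // later: checksum(new org.apache.hadoop.util.PureJavaCrc32(), chunk)
    }
}
{code}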

Rebased with latest trunk (last commit 1e36fb1e44bff96005dd75a25648ff25eea6a95f)

> Cassandra cannot detect corrupt-but-readable column data
> --------------------------------------------------------
>
>                 Key: CASSANDRA-1717
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Pavel Yaskevich
>             Fix For: 1.0
>
>         Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, 
> checksums.txt
>
>
> Most corruptions of on-disk data due to bitrot render the column (or row) 
> unreadable, so the data can be replaced by read repair or anti-entropy.  But 
> if the corruption keeps the column data readable, we do not detect it, and 
> if it corrupts to a higher timestamp value, it can even resist being 
> overwritten by newer values.
