[ 
https://issues.apache.org/jira/browse/KUDU-463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15494141#comment-15494141
 ] 

Will Berkeley commented on KUDU-463:
------------------------------------

Good points, but don't both cons apply to checksums in CFiles as well?

The difference for the first con is that CFiles already generally process 
data as it's being read (decoding dictionary blocks, decompressing, etc.), so 
adding extra logic to handle checksums is not as large a change as changing the 
blocks to handle checksums.
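
To make that concrete, here is a rough sketch of checksum verification as one 
more step in the read path, ahead of decoding. The trailer layout (payload 
followed by a 4-byte little-endian CRC32C) and the names write_block and 
read_block are hypothetical, not Kudu's actual on-disk format, and it is in 
Python rather than Kudu's C++ just to keep it short.

```python
import struct

CRC32C_POLY = 0x82F63B78  # reflected Castagnoli polynomial


def crc32c(data: bytes) -> int:
    """Bitwise CRC32C: slow, but has no dependencies."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (CRC32C_POLY if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def write_block(payload: bytes) -> bytes:
    # Hypothetical layout: payload, then a 4-byte little-endian CRC32C trailer.
    return payload + struct.pack("<I", crc32c(payload))


def read_block(raw: bytes) -> bytes:
    # Verify the trailer first, then hand the payload on to decode/decompress.
    if len(raw) < 4:
        raise ValueError("block too short to hold a checksum")
    payload = raw[:-4]
    (stored,) = struct.unpack("<I", raw[-4:])
    if crc32c(payload) != stored:
        raise ValueError("checksum mismatch: possible on-disk corruption")
    return payload
```

A real implementation would presumably use a hardware-accelerated CRC32C 
rather than a bitwise loop; the point is only that the check slots in next to 
work the reader already does per block.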

The second point applies equally to checksums in CFiles, right? If the 
checksums are e.g. part of the header, a small block size could cause read 
amplification in the same way.

If checksums were part of the block metadata in the log block manager, it'd 
only be a small percentage increase in memory for the 4-byte checksum per block.
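
Back-of-the-envelope version of that estimate. The 4 bytes per checksum comes 
from CRC32C being 32 bits; the block count and the existing per-block metadata 
size passed in below are placeholders I made up for illustration, not measured 
values from the log block manager.

```python
CHECKSUM_BYTES = 4  # a CRC32C checksum is 32 bits


def checksum_overhead(num_blocks: int, metadata_bytes_per_block: int):
    """Return (extra bytes in memory, percent growth of per-block metadata)."""
    extra_bytes = num_blocks * CHECKSUM_BYTES
    percent = 100.0 * CHECKSUM_BYTES / metadata_bytes_per_block
    return extra_bytes, percent


# Placeholder inputs: 1 million live blocks, 64 bytes of metadata per block.
extra, pct = checksum_overhead(1_000_000, 64)
print(f"{extra} extra bytes, {pct:.2f}% more per-block metadata")
```

With those assumed inputs the extra memory is 4,000,000 bytes; the percentage 
scales inversely with whatever the real per-block metadata size turns out to be.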

Anyway, I'll test out something for checksums in CFiles.

> Add checksumming to cfile and other on-disk formats
> ---------------------------------------------------
>
>                 Key: KUDU-463
>                 URL: https://issues.apache.org/jira/browse/KUDU-463
>             Project: Kudu
>          Issue Type: Sub-task
>          Components: cfile, tablet
>    Affects Versions: Private Beta
>            Reporter: Todd Lipcon
>            Assignee: Adar Dembo
>              Labels: kudu-roadmap
>
> We should add CRC32C checksums to cfile blocks, metadata blocks, etc, to 
> protect against silent disk corruption. We should probably do this prior to a 
> public release, since it will likely have a negative performance impact, and 
> we don't want to have a public regression.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
