[ 
https://issues.apache.org/jira/browse/KUDU-2526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16581580#comment-16581580
 ] 

Grant Henke commented on KUDU-2526:
-----------------------------------

Looking at how we could implement this, it appears the most appropriate place to 
validate would be in _TabletCopyClient::DownloadBlock_ 
([here|https://github.com/apache/kudu/blob/master/src/kudu/tserver/tablet_copy_client.cc#L703]).

The checksums we have today are at the CFile header, footer, and data block 
(sometimes called page) level. We could read the entire block with the 
CFileReader to validate the existing checksums, but that may be more CPU 
intensive than we would like. We could add or break out some utility methods 
that validate the checksums while parsing as little of the data as possible. 
The header and footer PB would still need to be parsed, but we could avoid 
decompressing the data blocks.
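
As a rough sketch of the "validate without decompressing" idea: once the locations of the data blocks are known, each stored checksum can be checked against the raw bytes as they sit on disk. Everything below is hypothetical and is not the actual CFile layout or Kudu API: _BlockRegion_, _ChecksumFn_ and _FirstBadRegion_ are made-up names, and the code assumes each region is immediately followed by a 4-byte little-endian checksum computed over the stored (possibly compressed) bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical location of one data block inside a file image.
struct BlockRegion {
  size_t offset;  // start of the stored (possibly compressed) block bytes
  size_t size;    // length of those bytes, excluding the trailing checksum
};

// Checksum function is injected so the scan does not depend on one algorithm.
using ChecksumFn = std::function<uint32_t(const uint8_t*, size_t)>;

// Returns the index of the first region that fails validation, or -1 if
// every region passes. The block payload is never decoded or decompressed;
// only the raw stored bytes are checksummed.
int FirstBadRegion(const std::string& file,
                   const std::vector<BlockRegion>& regions,
                   const ChecksumFn& checksum) {
  const size_t kChecksumSize = 4;
  for (size_t i = 0; i < regions.size(); ++i) {
    const BlockRegion& r = regions[i];
    // Overflow-safe bounds check: the region plus its checksum must fit.
    if (r.offset > file.size() ||
        r.size > file.size() - r.offset ||
        file.size() - r.offset - r.size < kChecksumSize) {
      return static_cast<int>(i);
    }
    const uint8_t* p =
        reinterpret_cast<const uint8_t*>(file.data()) + r.offset;
    // Decode the stored little-endian checksum that follows the region.
    uint32_t stored = 0;
    for (int b = 0; b < 4; ++b) {
      stored |= static_cast<uint32_t>(p[r.size + b]) << (8 * b);
    }
    if (checksum(p, r.size) != stored) {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```

In this sketch the region list would come from the parsed header and footer, which is why those two still need to be decoded even though the data blocks do not.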

Alternatively, we could add a CRC32 checksum to the end of each block. That 
checksum could then be validated when the block is finalized. The tricky part 
here is that we don't have any versioning on the block format because it's not 
really a format. In order to support a feature flag and backwards compatibility, 
we would likely need to add a magic byte at the start so we can identify when a 
checksum is included.
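
To make the second option concrete, here is a minimal sketch of what a framed block could look like: one magic byte, the payload, then a 4-byte little-endian CRC32 of the payload. The magic value _0xC5_, the names _WrapBlock_ and _VerifyBlock_, and the use of plain CRC-32 (IEEE) are all assumptions for illustration; none of this is an existing Kudu format or API. Note that a single magic byte cannot fully rule out a legacy block that happens to start with the same value, so a real design would likely need something stronger.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical marker byte; not part of any real Kudu format.
constexpr uint8_t kChecksumMagic = 0xC5;
constexpr size_t kTrailerSize = 4;

enum class VerifyResult { kOk, kNoChecksum, kCorrupt };

// Bitwise CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320).
uint32_t Crc32(const uint8_t* data, size_t len) {
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; ++i) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; ++bit) {
      // The mask is all ones when the low bit is set, zero otherwise.
      crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
  }
  return crc ^ 0xFFFFFFFFu;
}

// Frames a payload as [magic][payload][crc32 of payload, little-endian].
std::string WrapBlock(const std::string& payload) {
  std::string out;
  out.reserve(1 + payload.size() + kTrailerSize);
  out.push_back(static_cast<char>(kChecksumMagic));
  out.append(payload);
  const uint32_t crc = Crc32(
      reinterpret_cast<const uint8_t*>(payload.data()), payload.size());
  for (int i = 0; i < 4; ++i) {
    out.push_back(static_cast<char>((crc >> (8 * i)) & 0xFF));
  }
  return out;
}

// Checks a block and, on success, copies the payload into *payload.
// A block without the magic byte is reported as kNoChecksum so the caller
// can fall back to treating it as a legacy, unchecksummed block.
VerifyResult VerifyBlock(const std::string& block, std::string* payload) {
  if (block.empty() || static_cast<uint8_t>(block[0]) != kChecksumMagic) {
    return VerifyResult::kNoChecksum;
  }
  if (block.size() < 1 + kTrailerSize) {
    return VerifyResult::kCorrupt;  // magic present but trailer truncated
  }
  const size_t payload_len = block.size() - 1 - kTrailerSize;
  uint32_t stored = 0;
  for (int i = 0; i < 4; ++i) {
    const uint8_t byte = static_cast<uint8_t>(block[1 + payload_len + i]);
    stored |= static_cast<uint32_t>(byte) << (8 * i);
  }
  const uint32_t actual = Crc32(
      reinterpret_cast<const uint8_t*>(block.data()) + 1, payload_len);
  if (stored != actual) {
    return VerifyResult::kCorrupt;
  }
  *payload = block.substr(1, payload_len);
  return VerifyResult::kOk;
}
```

With this shape, the copy client would call something like _VerifyBlock_ once all bytes of a block have been received, and a feature flag would decide whether _kNoChecksum_ is accepted or rejected.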

I am leaning towards the block checksums, but I was curious if anyone has any 
other ideas or opinions. 

> Checksum and validate blocks on tablet copy
> -------------------------------------------
>
>                 Key: KUDU-2526
>                 URL: https://issues.apache.org/jira/browse/KUDU-2526
>             Project: Kudu
>          Issue Type: Improvement
>          Components: tablet copy
>            Reporter: Grant Henke
>            Priority: Major
>
> In order to prevent viral corruption in the case that a leader has a corrupt 
> CFile, we should checksum (if needed) and verify the blocks while performing 
> a tablet copy. 


