[ 
https://issues.apache.org/jira/browse/CASSANDRA-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814502#comment-17814502
 ] 

Stefan Miklosovic commented on CASSANDRA-19369:
-----------------------------------------------

Why is this needed at all? When you write an SSTable, there is a DIGEST 
component that contains the crc32 of the data file. Doesn't analytics support 
this too? Wouldn't it make more sense to introduce a way to use checksum 
algorithms other than crc32 for data file integrity validation, and then 
reuse that from analytics? 
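
A rough sketch of what "a way to use different checksum algorithms" could look 
like, purely for illustration. The registry and the `file_checksum` helper are 
hypothetical names, not Cassandra or Analytics APIs, and Adler-32 stands in for 
any alternative algorithm because XXHash32 is not in the Python standard 
library.

```python
import zlib
from typing import Callable, Dict, Iterable, Tuple

# A step function folds one chunk into a running checksum value.
ChecksumStep = Callable[[bytes, int], int]

# Hypothetical registry: algorithm name -> (step function, initial value).
# Adler-32 is only a stand-in for "some algorithm other than crc32".
ALGORITHMS: Dict[str, Tuple[ChecksumStep, int]] = {
    "crc32": (zlib.crc32, 0),
    "adler32": (zlib.adler32, 1),
}


def file_checksum(chunks: Iterable[bytes], algorithm: str = "crc32") -> int:
    """Fold the chunks of a data file into one checksum, streaming-style."""
    step, value = ALGORITHMS[algorithm]
    for chunk in chunks:
        value = step(chunk, value)
    return value
```

With this shape the writer and any consumer only need to agree on the 
algorithm name, and the same helper serves both sides.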

> [Analytics] Use XXHash32 for digest calculation of SSTables
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-19369
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19369
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Analytics Library
>            Reporter: Francisco Guerrero
>            Assignee: Francisco Guerrero
>            Priority: Normal
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> During bulk writes, Cassandra Analytics calculates the MD5 checksum of every 
> SSTable it produces. During SSTable upload to Cassandra Sidecar, Cassandra 
> Analytics includes the {{content-md5}} header as part of the upload request. 
> This information is used by Cassandra Sidecar to validate the integrity of 
> the uploaded SSTable and prevent issues with bit flips and corrupted SSTables.
> Recently, Cassandra Sidecar introduced [support for additional checksum 
> validations|https://issues.apache.org/jira/browse/CASSANDRASC-97] during 
> SSTable upload. Notably, XXHash32 digest support was added, which offers 
> more performant checksum calculations. This allows Cassandra Analytics to 
> use a more efficient digest algorithm that is lighter on the CPU usage of 
> Sidecar and Spark resources.
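
The upload flow described in the issue can be sketched roughly as follows. 
This assumes the `content-md5` value is the base64-encoded MD5 digest, as in 
the standard Content-MD5 header; `content_md5` and `verify_upload` are 
hypothetical helper names, not Sidecar or Analytics functions.

```python
import base64
import hashlib
from typing import Iterable


def content_md5(chunks: Iterable[bytes]) -> str:
    """Sender side: stream the file through MD5, return the header value."""
    md5 = hashlib.md5()
    for chunk in chunks:
        md5.update(chunk)
    # Assumption: the header carries base64 of the raw 16-byte digest.
    return base64.b64encode(md5.digest()).decode("ascii")


def verify_upload(received: bytes, header_value: str) -> bool:
    """Receiver side: recompute the digest and compare it with the header."""
    return content_md5([received]) == header_value
```

A single flipped bit in the received bytes changes the recomputed digest, so 
the comparison fails and the upload can be rejected.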


