[ https://issues.apache.org/jira/browse/HDFS-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16409094#comment-16409094 ]

Xiao Chen commented on HDFS-13056:
----------------------------------

Thanks a lot for the work here [~dennishuo]! Looks pretty good overall.

A few review comments, mostly cosmetic:
- What's the plan for webhdfs? It seems {{DFSClient#getFileChecksum}} is only 
necessary because of webhdfs.
- {{InterfaceAudience.Private}} is by default {{InterfaceStability.Unstable}}, 
so we can probably remove {{Evolving}} unless there is a reason. We should mark 
the {{LimitedPrivate}} classes as unstable instead of evolving too (see sketch 
below).
- The {{StripedBlockChecksumReconstructor}} refactor into 
prepareDigester/updateDigester/commitDigest looks a little odd to me, but I 
understand that without it we'd need two mostly-duplicated {{reconstruct}} 
implementations. At a minimum we should check for {{digester != null}} in the 
latter two methods (see sketch below).
- CrcUtil: I see {{writeInt}} and {{readInt}} mention the requirement about 
buffer length vs. offset, which is good. But for code safety, let's add 
validation that actually checks the input parameters (see sketch below).
- CrcUtil: newSingleCrcWrapperFromByteArray: how about naming it something like 
toSingleCrcString and returning a {{String}} instead of an {{Object}}? In the 
caller we can also save it to a String variable (and similarly save 
md5.toString() to one), within an isDebugEnabled check (see sketch below).
- FileChecksumHelper: {{makeFinalResult}} returns null instead of throwing when 
{{combineMode}} is unknown, unlike other places. I think throwing is the right 
thing to do (see sketch below).
- FileChecksumHelper: {{tryDatanode}}: let's also log the {{blockChecksumType}} 
in the debug log at the end of the method.
- BlockChecksumOptions: trivial, but it's more canonical to call the 2-param 
ctor from the 1-param ctor, like {{this(blockChecksumType, 0);}}
- {{LOG.isDebugEnabled}} is not required when using parameterized messages 
(https://www.slf4j.org/faq.html#logging_performance), so it can be removed from 
{{BlockChecksumHelper$BlockGroupNonStripedChecksumComputer#reassembleNonStripedCompositeCrc}} 
(see sketch below).
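
To make a few of the points above concrete, some rough sketches follow; class, 
method, and variable names and the method bodies are illustrative guesses on my 
part, not taken from the patch. On the {{InterfaceStability}} point, something 
like:

{code:java}
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;

// Hypothetical class name; the point is pairing LimitedPrivate with Unstable
// rather than Evolving.
@InterfaceAudience.LimitedPrivate({"HDFS"})
@InterfaceStability.Unstable
public class SomeChecksumHelperClass {
}
{code}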
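
On the {{digester != null}} check in {{StripedBlockChecksumReconstructor}} (the 
method name follows the comment above; the body is just a sketch):

{code:java}
// Sketch: guard the MD5-specific step so the CRC-based path, which never
// prepares a digester, safely no-ops instead of risking an NPE.
void updateDigester(byte[] buf, int len) {
  if (digester == null) {
    return;
  }
  digester.update(buf, 0, len);
}
{code}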
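
On the {{CrcUtil}} parameter validation; the exception type and the byte layout 
shown are only assumptions for illustration:

{code:java}
public static void writeInt(byte[] buf, int offset, int value) throws IOException {
  // Validate instead of relying only on the javadoc'd precondition.
  if (offset < 0 || offset + 4 > buf.length) {
    throw new IOException(String.format(
        "writeInt out of bounds: buf.length=%d, offset=%d", buf.length, offset));
  }
  buf[offset] = (byte) ((value >>> 24) & 0xff);
  buf[offset + 1] = (byte) ((value >>> 16) & 0xff);
  buf[offset + 2] = (byte) ((value >>> 8) & 0xff);
  buf[offset + 3] = (byte) (value & 0xff);
}
{code}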
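
On {{newSingleCrcWrapperFromByteArray}}, roughly what I have in mind 
({{toSingleCrcString}} is just my suggested name, and I'm assuming a 
{{readInt(byte[], int)}} signature for the existing helper):

{code:java}
public static String toSingleCrcString(byte[] bytes) throws IOException {
  if (bytes.length != 4) {
    throw new IOException(String.format(
        "Unexpected byte[] length '%d' for a single CRC", bytes.length));
  }
  return String.format("0x%08x", readInt(bytes, 0));
}

// Caller side (variable names illustrative), keeping the String conversions
// inside the debug check:
if (LOG.isDebugEnabled()) {
  String blockCrcString = CrcUtil.toSingleCrcString(blockCrcBytes);
  String md5String = md5.toString();
  LOG.debug("Got block CRC {} and md5 {}", blockCrcString, md5String);
}
{code}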
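
On {{makeFinalResult}}, instead of returning null for an unknown 
{{combineMode}}, something along these lines (enum constants and helper names 
are illustrative):

{code:java}
switch (combineMode) {
case MD5MD5CRC:
  return makeMd5CrcResult();
case COMPOSITE_CRC:
  return makeCompositeCrcResult();
default:
  // Fail loudly, consistent with how unknown values are handled elsewhere.
  throw new IOException("Unknown combine mode: " + combineMode);
}
{code}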
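
On the slf4j point: with parameterized messages the formatting only happens 
when debug is enabled, so the guard is redundant there and only worth keeping 
when an argument is itself expensive to build (variable names illustrative):

{code:java}
// No guard needed: the message is only formatted if debug logging is enabled.
LOG.debug("Reassembled composite CRC for block group {}: {}",
    blockGroupId, compositeCrc);

// Guard still useful when computing an argument is costly, e.g. the
// toSingleCrcString()/md5.toString() case above.
if (LOG.isDebugEnabled()) {
  LOG.debug("Block CRC: {}", CrcUtil.toSingleCrcString(blockCrcBytes));
}
{code}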


> Expose file-level composite CRCs in HDFS which are comparable across 
> different instances/layouts
> ------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-13056
>                 URL: https://issues.apache.org/jira/browse/HDFS-13056
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, distcp, erasure-coding, federation, hdfs
>    Affects Versions: 3.0.0
>            Reporter: Dennis Huo
>            Assignee: Dennis Huo
>            Priority: Major
>         Attachments: HDFS-13056-branch-2.8.001.patch, 
> HDFS-13056-branch-2.8.002.patch, HDFS-13056-branch-2.8.003.patch, 
> HDFS-13056-branch-2.8.004.patch, HDFS-13056-branch-2.8.005.patch, 
> HDFS-13056-branch-2.8.poc1.patch, HDFS-13056.001.patch, HDFS-13056.002.patch, 
> HDFS-13056.003.patch, HDFS-13056.003.patch, HDFS-13056.004.patch, 
> HDFS-13056.005.patch, HDFS-13056.006.patch, HDFS-13056.007.patch, 
> HDFS-13056.008.patch, Reference_only_zhen_PPOC_hadoop2.6.X.diff, 
> hdfs-file-composite-crc32-v1.pdf, hdfs-file-composite-crc32-v2.pdf, 
> hdfs-file-composite-crc32-v3.pdf
>
>
> FileChecksum was first introduced in 
> [https://issues-test.apache.org/jira/browse/HADOOP-3981] and ever since then 
> has remained defined as MD5-of-MD5-of-CRC, where per-512-byte chunk CRCs are 
> already stored as part of datanode metadata, and the MD5 approach is used to 
> compute an aggregate value in a distributed manner, with individual datanodes 
> computing the MD5-of-CRCs per-block in parallel, and the HDFS client 
> computing the second-level MD5.
>  
> A shortcoming of this approach which is often brought up is the fact that 
> this FileChecksum is sensitive to the internal block-size and chunk-size 
> configuration, and thus different HDFS files with different block/chunk 
> settings cannot be compared. More commonly, one might have different HDFS 
> clusters which use different block sizes, in which case any data migration 
> won't be able to use the FileChecksum for distcp's rsync functionality or for 
> verifying end-to-end data integrity (on top of low-level data integrity 
> checks applied at data transfer time).
>  
> This was also revisited in https://issues.apache.org/jira/browse/HDFS-8430 
> during the addition of checksum support for striped erasure-coded files; 
> while there was some discussion of using CRC composability, it still 
> ultimately settled on the hierarchical MD5 approach, which also adds the 
> problem that checksums of basic replicated files are not comparable to those 
> of striped files.
>  
> This feature proposes to add a "COMPOSITE-CRC" FileChecksum type which uses 
> CRC composition to remain completely chunk/block agnostic, and allows 
> comparison between striped vs. replicated files, between different HDFS 
> instances, and possibly even between HDFS and other external storage systems. 
> This feature can also be added in-place to be compatible with existing block 
> metadata, and doesn't need to change the normal path of chunk verification, 
> so is minimally invasive. This also means even large preexisting HDFS 
> deployments could adopt this feature to retroactively sync data. A detailed 
> design document can be found here: 
> https://storage.googleapis.com/dennishuo/hdfs-file-composite-crc32-v1.pdf



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
