[ https://issues.apache.org/jira/browse/HADOOP-18564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17645253#comment-17645253 ]

Steve Loughran commented on HADOOP-18564:
-----------------------------------------

FWIW there's an etag source API which abfs and s3a use to return etags in 
listings/getFileStatus; both offer the capability "fs.capability.etags.available" 
when etags are available, and "fs.capability.etags.preserved.in.rename" when rename 
preserves etags (adls gen 2, not s3 or wasb)
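
for illustration, a minimal sketch of probing those capabilities and reading the 
etag off a listing entry (the path is a placeholder; etag-aware stores hand back 
FileStatus subclasses that implement EtagSource):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.EtagSource;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class EtagProbe {
  public static void main(String[] args) throws Exception {
    // placeholder: point this at an abfs:// or s3a:// directory
    Path dir = new Path(args[0]);
    FileSystem fs = dir.getFileSystem(new Configuration());

    // capability probes; stores without etag support answer false
    boolean available =
        fs.hasPathCapability(dir, "fs.capability.etags.available");
    boolean preservedInRename =
        fs.hasPathCapability(dir, "fs.capability.etags.preserved.in.rename");
    System.out.println("etags available=" + available
        + ", preserved in rename=" + preservedInRename);

    if (available) {
      for (FileStatus status : fs.listStatus(dir)) {
        // etag-aware stores return FileStatus subclasses implementing EtagSource
        if (status instanceof EtagSource) {
          System.out.println(status.getPath() + " etag="
              + ((EtagSource) status).getEtag());
        }
      }
    }
  }
}
{code}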

google gcs has done some work to try and offer a checksum consistent with hdfs

the whole idea of using source dist checksums to see if something is updated 
isn't sustainable; it doesn't work across encryption zones (correct?) or 
between stores. you really need to store (where?) the list of (source checksum, 
dest checksum) values after the copy, and then recognise that if they are 
unchanged, the dest file is up to date.
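
a hypothetical sketch of that pairing idea (CopyRecord and wherever it gets 
persisted are invented for illustration; getFileChecksum may return null on 
stores that don't expose checksums):

{code:java}
import java.io.IOException;
import java.util.Objects;
import org.apache.hadoop.fs.FileChecksum;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.util.StringUtils;

/** Invented record type: the (source checksum, dest checksum) pair taken at copy time. */
class CopyRecord {
  final String sourceChecksum;
  final String destChecksum;
  CopyRecord(String source, String dest) {
    this.sourceChecksum = source;
    this.destChecksum = dest;
  }
}

class ChecksumPairTracker {
  // hex-encode a checksum; null when the store doesn't expose one
  private static String hex(FileChecksum checksum) {
    return checksum == null ? null : StringUtils.byteToHexString(checksum.getBytes());
  }

  /** After a successful copy: remember both checksums (where to persist them is left open). */
  static CopyRecord recordCopy(FileSystem srcFs, Path src,
      FileSystem dstFs, Path dst) throws IOException {
    return new CopyRecord(hex(srcFs.getFileChecksum(src)),
        hex(dstFs.getFileChecksum(dst)));
  }

  /** On a later sync: up to date only if neither side's checksum changed since the copy. */
  static boolean upToDate(CopyRecord record, FileSystem srcFs, Path src,
      FileSystem dstFs, Path dst) throws IOException {
    return record != null
        && Objects.equals(record.sourceChecksum, hex(srcFs.getFileChecksum(src)))
        && Objects.equals(record.destChecksum, hex(dstFs.getFileChecksum(dst)));
  }
}
{code}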

oh, and the only reason s3a doesn't offer checksums by default is that turning 
it on broke too many existing distcp jobs. if distcp wasn't so brittle about 
comparisons then it would have been left on.

anyway
* -1 to changing the default
* +1 to making that default less brittle
* +1 to any new option for different comparisons, like the option you've 
specified. make sure the logs note it somewhere.

> Use file-level checksum by default when copying between two different file 
> systems
> ----------------------------------------------------------------------------------
>
>                 Key: HADOOP-18564
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18564
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Siyao Meng
>            Priority: Major
>              Labels: DistCp
>
> h2. Goal
> Reduce user friction
> h2. Background
> When distcp'ing between two different file systems, distcp still uses a 
> block-level checksum by default, even though the two file systems can be very 
> different in how they manage blocks, so a block-level checksum comparison no 
> longer makes sense between them.
> e.g. distcp between HDFS and Ozone without overriding 
> {{dfs.checksum.combine.mode}} throws IOException because the blocks of the 
> same file on two FSes are different (as expected):
> {code}
> $ hadoop distcp -i -pp /test o3fs://buck-test1.vol1.ozone1/
> java.lang.Exception: java.io.IOException: File copy failed: 
> hdfs://duong-1.duong.root.hwx.site:8020/test/test.bin --> 
> o3fs://buck-test1.vol1.ozone1/test/test.bin
>       at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
>       at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
> Caused by: java.io.IOException: File copy failed: 
> hdfs://duong-1.duong.root.hwx.site:8020/test/test.bin --> 
> o3fs://buck-test1.vol1.ozone1/test/test.bin
>       at 
> org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:262)
>       at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:219)
>       at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:48)
>       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
>       at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Couldn't run retriable-command: Copying 
> hdfs://duong-1.duong.root.hwx.site:8020/test/test.bin to 
> o3fs://buck-test1.vol1.ozone1/test/test.bin
>       at 
> org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101)
>       at 
> org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:258)
>       ... 11 more
> Caused by: java.io.IOException: Checksum mismatch between 
> hdfs://duong-1.duong.root.hwx.site:8020/test/test.bin and 
> o3fs://buck-test1.vol1.ozone1/.distcp.tmp.attempt_local1346550241_0001_m_000000_0.
> Source and destination filesystems are of different types
> Their checksum algorithms may be incompatible You can choose file-level 
> checksum validation via -Ddfs.checksum.combine.mode=COMPOSITE_CRC when 
> block-sizes or filesystems are different. Or you can skip checksum-checks 
> altogether  with -skipcrccheck.
> {code}
> And it works when we use a file-level checksum like {{COMPOSITE_CRC}}:
> {code:title=With -Ddfs.checksum.combine.mode=COMPOSITE_CRC}
> $ hadoop distcp -i -pp /test o3fs://buck-test2.vol1.ozone1/ 
> -Ddfs.checksum.combine.mode=COMPOSITE_CRC
> 22/10/18 19:07:42 INFO mapreduce.Job: Job job_local386071499_0001 completed 
> successfully
> 22/10/18 19:07:42 INFO mapreduce.Job: Counters: 30
>       File System Counters
>               FILE: Number of bytes read=219900
>               FILE: Number of bytes written=794129
>               FILE: Number of read operations=0
>               FILE: Number of large read operations=0
>               FILE: Number of write operations=0
>               HDFS: Number of bytes read=0
>               HDFS: Number of bytes written=0
>               HDFS: Number of read operations=13
>               HDFS: Number of large read operations=0
>               HDFS: Number of write operations=2
>               HDFS: Number of bytes read erasure-coded=0
>               O3FS: Number of bytes read=0
>               O3FS: Number of bytes written=0
>               O3FS: Number of read operations=5
>               O3FS: Number of large read operations=0
>               O3FS: Number of write operations=0
> ..
> {code}
> h2. Alternative
> (if changing global defaults could potentially break distcp'ing between 
> HDFS/S3/etc. Also [~weichiu] mentioned COMPOSITE_CRC was only added in Hadoop 
> 3.1.1. So this might be the only way.)
> Don't touch the global default, and make it a client-side config.
> e.g. add a config to allow automatic usage of COMPOSITE_CRC 
> (dfs.checksum.combine.mode) when distcp'ing between HDFS and Ozone, which 
> would be the equivalent of specifying 
> {{-Ddfs.checksum.combine.mode=COMPOSITE_CRC}} on the distcp command, but the 
> end user won't have to specify it every single time.
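> For reference, the property can already be set in the client-side configuration 
> so it applies to every distcp invocation without the -D flag; an illustrative 
> snippet (where the file lives depends on the deployment):
> {code:title=hdfs-site.xml (client side)}
> <property>
>   <name>dfs.checksum.combine.mode</name>
>   <value>COMPOSITE_CRC</value>
> </property>
> {code}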
> cc [~duongnguyen] [~weichiu]


