[
https://issues.apache.org/jira/browse/HADOOP-15273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Loughran updated HADOOP-15273:
------------------------------------
Description:
When using distcp without {{-skipcrcchecks}}, if there's a checksum mismatch
between src and dest store types (e.g. HDFS to S3), the error message
talks about block size, even when it's the underlying checksum protocol itself
which is the cause of the failure:
bq. Source and target differ in block-size. Use -pb to preserve block-sizes
during copy. Alternatively, skip checksum-checks altogether, using -skipCrc.
(NOTE: By skipping checksums, one runs the risk of masking data-corruption
during file-transfer.)
update: the CRC check always takes place on a distcp upload before the file is
renamed into place, *and you can't disable it then*
was:
When using distcp without {{-skipCRC}}, if there's a checksum mismatch between
src and dest store types (e.g. HDFS to S3), the error message
talks about block size, even when it's the underlying checksum protocol itself
which is the cause of the failure:
bq. Source and target differ in block-size. Use -pb to preserve block-sizes
during copy. Alternatively, skip checksum-checks altogether, using -skipCrc.
(NOTE: By skipping checksums, one runs the risk of masking data-corruption
during file-transfer.)
If the checksum types are fundamentally different, the error message should say
so.
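
A minimal sketch of the kind of diagnostic the previous description asks for, written in Python for brevity and not taken from the distcp source. The {{Checksum}} type, the {{explain_mismatch}} function, the message wording other than the quoted block-size text, and the algorithm labels in the usage lines are all hypothetical. The assumption is that each store reports an algorithm name with its checksum, and that a missing checksum means the comparison is skipped:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Checksum:
    """Hypothetical stand-in for a store's file checksum."""
    algorithm: str  # name of the checksum algorithm the store reports
    value: bytes    # raw checksum bytes


def explain_mismatch(src: Optional[Checksum], dst: Optional[Checksum],
                     src_block_size: int, dst_block_size: int) -> Optional[str]:
    """Return None if there is nothing to report, else a message naming the likely cause."""
    # Assumption: if either store exposes no checksum, there is nothing to compare.
    if src is None or dst is None:
        return None
    # Check the algorithm first: checksums from different algorithms are not
    # comparable, so block size is not the real problem in this case.
    if src.algorithm != dst.algorithm:
        return ("Source and target use different checksum algorithms "
                f"({src.algorithm} vs {dst.algorithm}); the checksums cannot be compared.")
    if src.value == dst.value:
        return None
    # Same algorithm, different values: only now is block size a plausible cause.
    if src_block_size != dst_block_size:
        return ("Source and target differ in block-size. "
                "Use -pb to preserve block-sizes during copy.")
    return "Checksum mismatch between source and target."


# Usage with illustrative algorithm labels for an HDFS source and an object-store target.
hdfs = Checksum("MD5-of-0MD5-of-512CRC32C", b"\x01\x02")
s3 = Checksum("etag", b"\x0a\x0b")
print(explain_mismatch(hdfs, s3, 128 * 1024 * 1024, 32 * 1024 * 1024))
```

Testing the algorithm name before the block size keeps the block-size hint for the one case where it can actually help, which is two stores that use the same checksum scheme.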
> distcp can't handle remote stores with different checksum algorithms
> --------------------------------------------------------------------
>
> Key: HADOOP-15273
> URL: https://issues.apache.org/jira/browse/HADOOP-15273
> Project: Hadoop Common
> Issue Type: Bug
> Components: tools/distcp
> Affects Versions: 3.1.0
> Reporter: Steve Loughran
> Priority: Critical
>
> When using distcp without {{-skipcrcchecks}}, if there's a checksum mismatch
> between src and dest store types (e.g. HDFS to S3), the error message
> talks about block size, even when it's the underlying checksum protocol
> itself which is the cause of the failure:
> bq. Source and target differ in block-size. Use -pb to preserve block-sizes
> during copy. Alternatively, skip checksum-checks altogether, using -skipCrc.
> (NOTE: By skipping checksums, one runs the risk of masking data-corruption
> during file-transfer.)
> update: the CRC check always takes place on a distcp upload before the file
> is renamed into place, *and you can't disable it then*