[ 
https://issues.apache.org/jira/browse/HADOOP-18672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17703164#comment-17703164
 ] 

Steve Loughran commented on HADOOP-18672:
-----------------------------------------

azure storage doesn't have checksums

It does have etags. While distcp can't cope with those, you are welcome to 
implement your own equivalent which logs the source checksum and dest etag, 
and only updates a file when they have changed. This would also work for s3, 
and for any other store implementing EtagSource on its FileStatus.
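
A minimal sketch of that idea, written in Python only to show the bookkeeping. 
The names CopyRecord, needs_copy and record_copy are hypothetical, as is the 
in-memory dict used as the log; a real tool would persist the log and would 
read the destination etag from the store (in Hadoop, from a FileStatus that 
implements EtagSource).

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class CopyRecord:
    """What was logged the last time a file was copied (hypothetical format)."""
    source_checksum: str  # checksum reported by the source filesystem
    dest_etag: str        # etag reported by the destination store after upload


def needs_copy(manifest: Dict[str, CopyRecord], path: str,
               source_checksum: str, dest_etag: Optional[str]) -> bool:
    """Return True if the file should be (re)copied.

    The two values are never compared with each other, since they are not
    compatible; each is only compared with its own logged value.
    """
    record = manifest.get(path)
    if record is None:
        return True   # never copied before
    if dest_etag is None:
        return True   # destination object is missing
    return (record.source_checksum != source_checksum
            or record.dest_etag != dest_etag)


def record_copy(manifest: Dict[str, CopyRecord], path: str,
                source_checksum: str, dest_etag: str) -> None:
    """Log the source checksum and destination etag after a successful copy."""
    manifest[path] = CopyRecord(source_checksum, dest_etag)
```

The caller would invoke record_copy after each successful upload and consult 
needs_copy on the next run, so an unchanged source with an unchanged 
destination etag is skipped.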

s3a can export etags as its checksum, but as that is not compatible with 
distcp, it broke all jobs run without -skipcrccheck. That's why it's disabled. 
abfs would be the same unless, like gcs, azure storage added a compatible 
checksum.

> ask: abfs connector to support checksum
> ---------------------------------------
>
>                 Key: HADOOP-18672
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18672
>             Project: Hadoop Common
>          Issue Type: Wish
>          Components: fs/azure
>            Reporter: Wei-Hsiang Lin
>            Priority: Major
>
> Hi Hadoop-Azure community,
> I cannot find much information on why the abfs connector does not support 
> file-level checksums. Could you share some insight into why this is not 
> supported, and whether there is a plan to support it in the future?
> Having this would be helpful for migrating data from on-prem to Azure storage 
> using the abfs connector.
> ref https://hadoop.apache.org/docs/stable/hadoop-azure/abfs.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
