Steve Loughran created HADOOP-13282:
---------------------------------------
Summary: S3 blob etags to be made visible in
status/getFileChecksum() calls
Key: HADOOP-13282
URL: https://issues.apache.org/jira/browse/HADOOP-13282
Project: Hadoop Common
Issue Type: Sub-task
Components: fs/s3
Reporter: Steve Loughran
Priority: Minor
If the etags of blobs were exported via {{getFileChecksum()}}, it'd be
possible to probe for a blob being in sync with a local file. Distcp could use
this to decide whether to skip a file or not.
Now, there's a problem there: distcp needs source and dest filesystems to
implement the same algorithm. It'd only work out of the box if you were copying
between S3 instances. There are also quirks with encryption and multipart: [s3
docs|http://docs.aws.amazon.com/AmazonS3/latest/API/RESTCommonResponseHeaders.html].
At the very least, it's something which could be used when indexing the FS, to
check for changes later.
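A rough sketch of the local-file probe described above, assuming the simplest case only: a single-part, unencrypted upload, where the etag is the hex MD5 of the blob's data. {{EtagProbe}}, {{md5Hex()}} and {{inSync()}} are hypothetical names for illustration, not part of Hadoop or the AWS SDK, and the etag is assumed to have been fetched already by whatever means the filesystem exposes. Anything that doesn't look like a plain MD5 (e.g. the "-N" suffixed etags of multipart uploads) is reported as "can't tell" rather than as a mismatch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper for illustration; not a Hadoop or AWS SDK class. */
final class EtagProbe {

  private EtagProbe() {
  }

  /** Hex-encoded MD5 of a local file, read in 8 KB chunks. */
  static String md5Hex(Path file) throws IOException {
    MessageDigest md5;
    try {
      md5 = MessageDigest.getInstance("MD5");
    } catch (NoSuchAlgorithmException e) {
      // MD5 is a required algorithm on every Java platform.
      throw new IllegalStateException(e);
    }
    byte[] buffer = new byte[8192];
    try (InputStream in = Files.newInputStream(file)) {
      int n;
      while ((n = in.read(buffer)) != -1) {
        md5.update(buffer, 0, n);
      }
    }
    StringBuilder hex = new StringBuilder();
    for (byte b : md5.digest()) {
      hex.append(String.format("%02x", b & 0xff));
    }
    return hex.toString();
  }

  /**
   * Compare a local file against an etag the caller has already fetched.
   *
   * Returns empty when the etag is not a bare 32-digit hex string, such as
   * a multipart etag, because no MD5 comparison is possible then.
   * Assumption: a 32-digit etag is the MD5 of the data. That does not hold
   * for every encryption mode, so a "false" here is only a hint.
   */
  static Optional<Boolean> inSync(Path file, String etag) throws IOException {
    // Etags usually arrive wrapped in double quotes; strip them first.
    String cleaned = etag.replace("\"", "").trim().toLowerCase(Locale.ROOT);
    if (!cleaned.matches("[0-9a-f]{32}")) {
      return Optional.empty();
    }
    return Optional.of(cleaned.equals(md5Hex(file)));
  }
}
```

A caller would skip the copy only when {{inSync()}} returns {{Optional.of(true)}}, and fall back to its existing checks (length, modification time) on an empty result.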
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)