[
https://issues.apache.org/jira/browse/HADOOP-16412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Brahma Reddy Battula updated HADOOP-16412:
------------------------------------------
Target Version/s: 3.4.0 (was: 3.3.0)
Bulk update: moved all 3.3.0 non-blocker issues to 3.4.0. Please move this
issue back if it is a blocker.
> S3a getFileStatus to update DDB if an S3 query returns etag/versionID
> ---------------------------------------------------------------------
>
> Key: HADOOP-16412
> URL: https://issues.apache.org/jira/browse/HADOOP-16412
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.3.0
> Reporter: Steve Loughran
> Priority: Minor
>
> Now that S3Guard tables support etags and version IDs, we should do more to
> populate them.
> # listStatus/listFiles don't give us all the information; the AWS v1 and v2
> list operations only return the etags, not the version IDs
> # a treewalk on import with a HEAD on each object would be expensive and slow
> What we can do is, on a getFileStatus call, update the version markers of any
> S3Guard table entry where
> * the etag is already in the S3Guard entry
> * the probe of the store returns an entry with the same etag and a version ID
> In that situation we know the S3 data and S3Guard data are consistent, so
> updating the version ID fills out the data.
> We could also think about updating etags from entries created by older
> versions of S3Guard; it'd be a bit trickier there to decide if the S3 store
> entry was current. Probably safest to leave alone...
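
The getFileStatus update rule described in the issue can be sketched roughly as below. This is only an illustration of the two conditions, not the S3A or S3Guard implementation: the class `VersionIdBackfill`, the type `ObjectMarkers`, and the method `backfill` are hypothetical names invented for this sketch. It also assumes that an entry which already holds a version ID is left untouched, which the issue does not state explicitly.

```java
import java.util.Optional;

/** Hypothetical sketch; none of these types are real S3A/S3Guard classes. */
public class VersionIdBackfill {

  /** Stand-in for the etag/version ID pair of a table entry or a store probe. */
  public static final class ObjectMarkers {
    public final String etag;       // null if no etag was recorded
    public final String versionId;  // null if unknown or the bucket is unversioned

    public ObjectMarkers(String etag, String versionId) {
      this.etag = etag;
      this.versionId = versionId;
    }
  }

  /**
   * Decide whether a table entry can have its version ID filled in.
   *
   * @param tableEntry markers currently held in the (hypothetical) table entry
   * @param probed     markers returned by the probe of the store
   * @return the entry to write back, or empty if the table should be left alone
   */
  public static Optional<ObjectMarkers> backfill(ObjectMarkers tableEntry,
                                                 ObjectMarkers probed) {
    if (tableEntry == null || probed == null) {
      return Optional.empty();
    }
    // Condition 1: the etag is already in the table entry.
    if (tableEntry.etag == null) {
      return Optional.empty();
    }
    // Condition 2: the probe returns the same etag and a version ID.
    if (!tableEntry.etag.equals(probed.etag) || probed.versionId == null) {
      return Optional.empty();
    }
    // Assumption: only fill a missing version ID; never overwrite an existing one.
    if (tableEntry.versionId != null) {
      return Optional.empty();
    }
    // Matching etags mean store and table agree, so the version ID is safe to add.
    return Optional.of(new ObjectMarkers(tableEntry.etag, probed.versionId));
  }
}
```

Entries without an etag, such as those written by older S3Guard versions, fall through the first check and are left alone, matching the cautious option mentioned at the end of the description.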
--
This message was sent by Atlassian Jira
(v8.3.4#803005)