[
https://issues.apache.org/jira/browse/FLINK-25200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17479437#comment-17479437
]
Piotr Nowojski commented on FLINK-25200:
----------------------------------------
[~yunta], I'm not sure how much more information a more realistic test would
give us. Yes, one thing not covered by [~akalashnikov]'s test is local IO. But
when re-uploading a file instead of duplicating it, it's quite likely that the
state file will already be in the file cache, for example.
Regardless, after looking at those results, I'm beginning to doubt whether it
makes sense to provide native duplicate support for S3. It looks like the
performance cost of both operations (re-uploading and CopyObject) is the same
on the AWS side. I was hoping for/expecting an orders-of-magnitude performance
difference in favour of the CopyObject API.
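For reference, the two operations being compared can be sketched roughly as
below. This is only an illustration, not the benchmark [~akalashnikov] ran and
not Flink code: the helper names (duplicate_via_copy, duplicate_via_reupload,
timed) are made up here, and `s3` is assumed to be a boto3-style S3 client
exposing copy_object and upload_file.

```python
import time


def duplicate_via_copy(s3, bucket, src_key, dst_key):
    """Server-side duplicate: ask S3 to copy the object (CopyObject API)."""
    s3.copy_object(
        Bucket=bucket,
        Key=dst_key,
        CopySource={"Bucket": bucket, "Key": src_key},
    )


def duplicate_via_reupload(s3, bucket, local_path, dst_key):
    """Client-side duplicate: upload the local state file again under a new key."""
    s3.upload_file(Filename=local_path, Bucket=bucket, Key=dst_key)


def timed(fn, *args):
    """Return the wall-clock seconds taken by fn(*args)."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start
```

With a real client (for example one created via boto3.client("s3")), calling
timed(duplicate_via_copy, s3, bucket, src, dst) and
timed(duplicate_via_reupload, s3, bucket, path, dst) on the same file would
give the two numbers to compare. Note that a single CopyObject call is limited
to objects of up to 5 GB.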
> Implement duplicating for s3 filesystem
> ---------------------------------------
>
> Key: FLINK-25200
> URL: https://issues.apache.org/jira/browse/FLINK-25200
> Project: Flink
> Issue Type: Sub-task
> Components: FileSystems
> Reporter: Dawid Wysakowicz
> Priority: Major
> Fix For: 1.15.0
>
>
> We can use https://docs.aws.amazon.com/AmazonS3/latest/API/API_CopyObject.html