[ 
https://issues.apache.org/jira/browse/FLINK-8801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812220#comment-16812220
 ] 

Till Rohrmann commented on FLINK-8801:
--------------------------------------

Hi [~Tao Yang] and [~yanyan300300], thanks for reporting this issue. You are 
correct that some {{FileSystem}} implementations leave {{setTimes}} as a 
no-op. The idea behind this fix was to avoid eventual consistency problems. 
With your proposed change, [~yanyan300300] and [~Tao Yang], you might run 
into the problem this issue tried to fix, namely that the file does not yet 
exist when its modification timestamp is read (depending on the file system 
implementation).

However, the current state does not seem to fully fix the problem. [~NicoK], 
with which file systems did we test this change? Could a workaround be to 
revert this change and add a retry loop if {{FileSystem#getFileStatus}} fails 
with {{FileNotFoundException}}?
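
To make the retry idea concrete, here is a minimal sketch of what such a loop 
could look like. It is not taken from the Flink code base: the class name 
{{RetryOnMissingFile}}, the {{IoCall}} interface, and the attempt and backoff 
parameters are all hypothetical, and the linear backoff is only one possible 
choice.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

/** Hypothetical helper, not part of Flink: retries a call while the file is not yet visible. */
public class RetryOnMissingFile {

    /** A call that may throw IOException, e.g. a file status lookup. */
    @FunctionalInterface
    public interface IoCall<T> {
        T call() throws IOException;
    }

    /**
     * Invokes the call up to maxAttempts times. Only FileNotFoundException
     * triggers a retry; any other IOException is propagated immediately.
     */
    public static <T> T retry(IoCall<T> call, int maxAttempts, long backoffMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        FileNotFoundException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (FileNotFoundException e) {
                // The object may simply not be visible yet; remember the error and retry.
                last = e;
                if (attempt < maxAttempts) {
                    // Linear backoff (an assumption): wait longer after each failed attempt.
                    Thread.sleep(backoffMillis * attempt);
                }
            }
        }
        // All attempts failed with FileNotFoundException; rethrow the last one.
        throw last;
    }
}
```

Assuming {{fs}} and {{path}} refer to the file system and the uploaded 
resource, a call site could then look like 
{{RetryOnMissingFile.retry(() -> fs.getFileStatus(path), 5, 200L)}}. The 
values 5 and 200 are placeholders and would need tuning against the file 
systems that actually show the problem.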

> S3's eventual consistent read-after-write may fail yarn deployment of 
> resources to S3
> -------------------------------------------------------------------------------------
>
>                 Key: FLINK-8801
>                 URL: https://issues.apache.org/jira/browse/FLINK-8801
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / YARN, FileSystems, Runtime / Coordination
>    Affects Versions: 1.4.0, 1.5.0
>            Reporter: Nico Kruber
>            Assignee: Nico Kruber
>            Priority: Blocker
>             Fix For: 1.4.3, 1.5.0
>
>
> According to 
> https://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html#ConsistencyModel:
> {quote}
> Amazon S3 provides read-after-write consistency for PUTS of new objects in 
> your S3 bucket in all regions with one caveat. The caveat is that if you make 
> a HEAD or GET request to the key name (to find if the object exists) before 
> creating the object, Amazon S3 provides eventual consistency for 
> read-after-write.
> {quote}
> Some S3 file system implementations may actually execute such a request for 
> the about-to-write object and thus the read-after-write is only eventually 
> consistent. {{org.apache.flink.yarn.Utils#setupLocalResource()}} currently 
> relies on a consistent read-after-write since it accesses the remote resource 
> to get file size and modification timestamp. Since the local resource is 
> available at that point, we can use its data instead and circumvent the 
> problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
