[GitHub] flink pull request #5602: [FLINK-8801][yarn/s3] fix Utils#setupLocalResource...

2018-03-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5602


---


[GitHub] flink pull request #5602: [FLINK-8801][yarn/s3] fix Utils#setupLocalResource...

2018-02-28 Thread NicoK
GitHub user NicoK opened a pull request:

https://github.com/apache/flink/pull/5602

[FLINK-8801][yarn/s3] fix Utils#setupLocalResource() relying on consistent 
read-after-write

## What is the purpose of the change

> Amazon S3 provides read-after-write consistency for PUTS of new objects
> in your S3 bucket in all regions with one caveat. The caveat is that if
> you make a HEAD or GET request to the key name (to find if the object
> exists) before creating the object, Amazon S3 provides eventual
> consistency for read-after-write.

https://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html#ConsistencyModel

Some S3 file system implementations may actually execute such a request for
the about-to-write object, in which case the read-after-write is only
eventually consistent. `org.apache.flink.yarn.Utils#setupLocalResource()`
currently relies on a consistent read-after-write since it accesses the remote
resource to obtain its file size and modification timestamp. Since we still
have access to the local resource at this point, we can use its metadata
directly instead and circumvent the problem.
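
As a rough sketch of the idea (not the PR's actual code; the class and method
names below are made up for illustration): the YARN `LocalResource` can be
registered with the size and modification time taken from the local file,
instead of issuing a `getFileStatus()` call against the just-uploaded remote
object.

```java
import java.io.File;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.api.records.LocalResourceType;
import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
import org.apache.hadoop.yarn.util.ConverterUtils;
import org.apache.hadoop.yarn.util.Records;

public final class LocalResourceFromLocalFile {

    /**
     * Builds a YARN LocalResource for a file that was just uploaded to
     * {@code remotePath}, using the metadata of the local source file instead
     * of reading the remote object back (which is only eventually consistent
     * on S3 after a prior HEAD/GET on the same key).
     */
    static LocalResource registerLocalResource(File localFile, Path remotePath) {
        LocalResource resource = Records.newRecord(LocalResource.class);
        resource.setResource(ConverterUtils.getYarnUrlFromPath(remotePath));
        // take size and modification time from the local file,
        // not from the remote object
        resource.setSize(localFile.length());
        resource.setTimestamp(localFile.lastModified());
        resource.setType(LocalResourceType.FILE);
        resource.setVisibility(LocalResourceVisibility.APPLICATION);
        return resource;
    }
}
```

YARN later verifies the registered timestamp against the remote file when
localizing the resource, which is why the modification time also needs to be
carried over to the uploaded copy (see the change log below).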

Please note that this PR is built upon #5601.

## Brief change log

- do not retrieve the remote object after writing it just to obtain its file 
statistics
- preserve the modification times of uploaded resources so that they pass 
YARN's timestamp check during resource localization (see the sketch below)
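
For the second point, a minimal sketch of one way to carry the local
modification time over to the uploaded copy, assuming a Hadoop `FileSystem`
handle for the target (illustrative names, not the PR's actual code):

```java
import java.io.File;
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class PreserveModificationTime {

    /**
     * After copying {@code localFile} to {@code remotePath}, stamps the remote
     * copy with the local modification time so that YARN's timestamp check
     * passes without having to read the remote object's status back.
     */
    static void preserveModificationTime(FileSystem fs, File localFile, Path remotePath)
            throws IOException {
        // mtime = local modification time; atime = -1 leaves the access time unchanged
        fs.setTimes(remotePath, localFile.lastModified(), -1);
    }
}
```

Note that Hadoop's base `FileSystem#setTimes()` is a no-op by default, so
whether the timestamp actually sticks depends on the concrete file system
implementation.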

## Verifying this change

This change is already covered by existing tests, such as 
`YARNSessionCapacitySchedulerITCase`, which verifies that YARN accepts the 
uploaded resources, and `YarnFileStageTestS3ITCase`, which covers the upload 
path via S3.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): **no**
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: **no**
  - The serializers: **no**
  - The runtime per-record code paths (performance sensitive): **no**
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: **yes**
  - The S3 file system connector: **yes**

## Documentation

  - Does this pull request introduce a new feature? **no**
  - If yes, how is the feature documented? **JavaDocs**


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NicoK/flink flink-8801

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5602.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5602


commit 9570f0d0afe528e2814006a120e7c424b96753d0
Author: Nico Kruber 
Date:   2018-02-27T16:23:20Z

[FLINK-8801][yarn/s3] fix Utils#setupLocalResource() relying on consistent 
read-after-write

"Amazon S3 provides read-after-write consistency for PUTS of new objects in 
your
S3 bucket in all regions with one caveat. The caveat is that if you make a 
HEAD
or GET request to the key name (to find if the object exists) before 
creating
the object, Amazon S3 provides eventual consistency for read-after-write."

https://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html#ConsistencyModel

Some S3 file system implementations may actually execute such a request for
the about-to-write object, in which case the read-after-write is only
eventually consistent. org.apache.flink.yarn.Utils#setupLocalResource()
currently relies on a consistent read-after-write since it accesses the
remote resource to obtain its file size and modification timestamp. Since we
still have access to the local resource at this point, we can use its
metadata directly instead and circumvent the problem.

commit 216d9674a5116ecd0d7d52aedd8126e2b3e12eea
Author: Nico Kruber 
Date:   2018-02-27T16:29:00Z

[FLINK-8336][yarn/s3][tests] harden YarnFileStageTest upload test for 
eventually consistent read-after-write

In case the newly written object cannot be read (yet), we retry up to 4 more
times, waiting 50ms between attempts. While this does not cover all cases, it
should make the (already rare) case of a written object not being available
for reading even more unlikely.
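
A rough sketch of such a retry loop, assuming a Hadoop `FileSystem` handle
(the helper name and exact structure are illustrative, not the test's actual
code):

```java
import java.io.FileNotFoundException;
import java.io.IOException;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class EventuallyConsistentReads {

    /**
     * Opens {@code path}, retrying a few times in case the object written just
     * before is not visible yet (eventually consistent read-after-write).
     */
    static FSDataInputStream openWithRetries(FileSystem fs, Path path)
            throws IOException, InterruptedException {
        final int extraRetries = 4;  // additional attempts after the first one
        final long waitMillis = 50L; // back-off between attempts

        for (int attempt = 0; ; attempt++) {
            try {
                return fs.open(path);
            } catch (FileNotFoundException e) {
                if (attempt >= extraRetries) {
                    throw e; // still not visible after all retries
                }
                Thread.sleep(waitMillis);
            }
        }
    }
}
```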




---