Ufuk Celebi created FLINK-4228:
----------------------------------
Summary: RocksDB semi-async snapshot to S3AFileSystem fails
Key: FLINK-4228
URL: https://issues.apache.org/jira/browse/FLINK-4228
Project: Flink
Issue Type: Bug
Components: State Backends, Checkpointing
Reporter: Ufuk Celebi
Using the {{RocksDBStateBackend}} with semi-async snapshots (the current default)
throws an exception when the snapshot is uploaded to S3 through
{{S3AFileSystem}}.
{code}
AsynchronousException{com.amazonaws.AmazonClientException: Unable to calculate MD5 hash: /var/folders/_c/5tc5q5q55qjcjtqwlwvwd1m00000gn/T/flink-io-5640e9f1-3ea4-4a0f-b4d9-3ce9fbd98d8a/7c6e745df2dddc6eb70def1240779e44/StreamFlatMap_3_0/dummy_state/47daaf2a-150c-4208-aa4b-409927e9e5b7/local-chk-2886 (Is a directory)}
    at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointThread.run(StreamTask.java:870)
Caused by: com.amazonaws.AmazonClientException: Unable to calculate MD5 hash: /var/folders/_c/5tc5q5q55qjcjtqwlwvwd1m00000gn/T/flink-io-5640e9f1-3ea4-4a0f-b4d9-3ce9fbd98d8a/7c6e745df2dddc6eb70def1240779e44/StreamFlatMap_3_0/dummy_state/47daaf2a-150c-4208-aa4b-409927e9e5b7/local-chk-2886 (Is a directory)
    at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1298)
    at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInOneChunk(UploadCallable.java:108)
    at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:100)
    at com.amazonaws.services.s3.transfer.internal.UploadMonitor.upload(UploadMonitor.java:192)
    at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:150)
    at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:50)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.FileNotFoundException: /var/folders/_c/5tc5q5q55qjcjtqwlwvwd1m00000gn/T/flink-io-5640e9f1-3ea4-4a0f-b4d9-3ce9fbd98d8a/7c6e745df2dddc6eb70def1240779e44/StreamFlatMap_3_0/dummy_state/47daaf2a-150c-4208-aa4b-409927e9e5b7/local-chk-2886 (Is a directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1294)
    ... 9 more
{code}
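The innermost cause in the trace is a {{FileInputStream}} being opened on a directory path. A minimal sketch of that behaviour in isolation, using only the JDK (the class and method names here are hypothetical and not part of Flink or the AWS SDK):

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Path;

public class DirectoryOpenSketch {

    /**
     * Returns true if opening the path as a file stream fails with
     * FileNotFoundException, which is what the JDK throws when the
     * path is a directory rather than a regular file.
     */
    public static boolean failsToOpen(Path path) throws IOException {
        try (FileInputStream in = new FileInputStream(path.toFile())) {
            return false; // opened successfully, so it is a readable regular file
        } catch (FileNotFoundException e) {
            return true;  // missing, a directory, or otherwise not openable
        }
    }
}
```

Passing a checkpoint directory such as the {{local-chk-2886}} path above would presumably return true, which matches the "(Is a directory)" message in the trace.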
The error does not occur with S3NFileSystem. The problem might be that
{{HDFSCopyToLocal}} assumes sub-folders are created automatically. For
{{S3AFileSystem}}, we might need to create the folders manually and copy only
the actual files. More investigation is required.
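One possible shape for the "create folders manually and copy only actual files" idea is sketched below. This is not the Flink fix: {{RemoteTarget}} is a hypothetical interface standing in for the remote file system, and {{uploadRecursively}} is an illustrative name. The sketch walks the local checkpoint directory, issues an explicit mkdirs for each directory, and uploads only regular files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class SnapshotUploadSketch {

    /** Hypothetical stand-in for the remote file system; not a Flink or Hadoop API. */
    public interface RemoteTarget {
        void mkdirs(String relativePath) throws IOException;
        void putFile(Path localFile, String relativePath) throws IOException;
    }

    /**
     * Walks localRoot in pre-order (parents before children), creates each
     * directory explicitly on the target, and uploads only regular files.
     * Returns the number of files uploaded.
     */
    public static int uploadRecursively(Path localRoot, RemoteTarget target) throws IOException {
        int uploaded = 0;
        try (Stream<Path> walk = Files.walk(localRoot)) {
            for (Path p : (Iterable<Path>) walk::iterator) {
                // Path relative to the snapshot root, with '/' as separator.
                String rel = localRoot.relativize(p).toString().replace('\\', '/');
                if (Files.isDirectory(p)) {
                    target.mkdirs(rel);      // never hand a directory to the uploader
                } else if (Files.isRegularFile(p)) {
                    target.putFile(p, rel);  // only actual files are uploaded
                    uploaded++;
                }
            }
        }
        return uploaded;
    }
}
```

In a real backend the two interface methods would presumably delegate to the file system's own directory-creation and file-copy calls; whether S3A needs the explicit directory step at all is part of the open investigation.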