[
https://issues.apache.org/jira/browse/FLINK-35232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17840938#comment-17840938
]
Vikas M commented on FLINK-35232:
---------------------------------
Thanks for the reply [~galenwarren].
In this case, we see errors from the `RecoverableWriter support leveraging the
Google Java library directly`.
Sample error stack traces we see in our Flink deployment:
h4. Path 1: FileCommitter.commit ->
GSRecoverableWriterCommitter.commitAfterRecovery
```
h4. com.google.cloud.storage.StorageException: Read timed out
at
com.google.cloud.storage.spi.v1.HttpStorageRpc.translate(HttpStorageRpc.java:233)
at
com.google.cloud.storage.spi.v1.HttpStorageRpc.compose(HttpStorageRpc.java:625)
at
com.google.cloud.storage.StorageImpl.lambda$compose$20(StorageImpl.java:482)
at
com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:105)
at com.google.cloud.RetryHelper.run(RetryHelper.java:76)
at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:50)
at com.google.cloud.storage.Retrying.run(Retrying.java:51)
at com.google.cloud.storage.StorageImpl.run(StorageImpl.java:1374)
at com.google.cloud.storage.StorageImpl.compose(StorageImpl.java:480)
at
org.apache.flink.fs.gs.storage.GSBlobStorageImpl.compose(GSBlobStorageImpl.java:158)
at
org.apache.flink.fs.gs.writer.GSRecoverableWriterCommitter.composeBlobs(GSRecoverableWriterCommitter.java:158)
at
org.apache.flink.fs.gs.writer.GSRecoverableWriterCommitter.writeFinalBlob(GSRecoverableWriterCommitter.java:189)
at
org.apache.flink.fs.gs.writer.GSRecoverableWriterCommitter.commitAfterRecovery(GSRecoverableWriterCommitter.java:109)
at
org.apache.flink.streaming.api.functions.sink.filesystem.OutputStreamBasedPartFileWriter$OutputStreamBasedPendingFile.commitAfterRecovery(OutputStreamBasedPartFileWriter.java:350)
at
org.apache.flink.connector.file.sink.committer.FileCommitter.commit(FileCommitter.java:62)
```
Path 2: FileWriter.write
```
com.google.cloud.storage.StorageException: Read timed out
at
com.google.cloud.storage.spi.v1.HttpStorageRpc.translate(HttpStorageRpc.java:233)
at
com.google.cloud.storage.spi.v1.HttpStorageRpc.open(HttpStorageRpc.java:949)
at
com.google.cloud.storage.ResumableMedia.lambda$null$0(ResumableMedia.java:37)
at
com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:105)
at com.google.cloud.RetryHelper.run(RetryHelper.java:76)
at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:50)
at com.google.cloud.storage.Retrying.run(Retrying.java:51)
at
com.google.cloud.storage.ResumableMedia.lambda$startUploadForBlobInfo$1(ResumableMedia.java:34)
at
com.google.cloud.storage.BlobWriteChannel$Builder.build(BlobWriteChannel.java:315)
at com.google.cloud.storage.StorageImpl.writer(StorageImpl.java:577)
at com.google.cloud.storage.StorageImpl.writer(StorageImpl.java:547)
at com.google.cloud.storage.StorageImpl.writer(StorageImpl.java:95)
at
org.apache.flink.fs.gs.storage.GSBlobStorageImpl.writeBlob(GSBlobStorageImpl.java:64)
at
org.apache.flink.fs.gs.writer.GSRecoverableFsDataOutputStream.createWriteChannel(GSRecoverableFsDataOutputStream.java:229)
at
org.apache.flink.fs.gs.writer.GSRecoverableFsDataOutputStream.write(GSRecoverableFsDataOutputStream.java:152)
at
org.apache.flink.formats.parquet.PositionOutputStreamAdapter.write(PositionOutputStreamAdapter.java:58)
```
Similar to path1, we do see these errors in `FileWriter.prepareCommit` as well..
> Support for retry settings on GCS connector
> -------------------------------------------
>
> Key: FLINK-35232
> URL: https://issues.apache.org/jira/browse/FLINK-35232
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / FileSystem
> Affects Versions: 1.15.3, 1.16.2, 1.17.1, 1.19.0, 1.18.1
> Reporter: Vikas M
> Assignee: Ravi Singh
> Priority: Major
>
> https://issues.apache.org/jira/browse/FLINK-32877 is tracking ability to
> specify transport options in GCS connector. While setting the params enabled
> here reduced read timeouts, we still see 503 errors leading to Flink job
> restarts.
> Thus, in this ticket, we want to specify additional retry settings as noted
> in
> [https://cloud.google.com/storage/docs/retry-strategy#customize-retries.|https://cloud.google.com/storage/docs/retry-strategy#customize-retries]
> We want
> [these|https://cloud.google.com/java/docs/reference/gax/latest/com.google.api.gax.retrying.RetrySettings#methods]
> methods available for Flink users so that they can customize their
> deployment.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)