[jira] [Commented] (FLINK-11187) StreamingFileSink with S3 backend transient socket timeout issues
[ https://issues.apache.org/jira/browse/FLINK-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743973#comment-16743973 ] Aljoscha Krettek commented on FLINK-11187: -- Fixed on release-1.7 in d5506b9191f0af0346e10561df783815e7cd0565 > StreamingFileSink with S3 backend transient socket timeout issues > -- > > Key: FLINK-11187 > URL: https://issues.apache.org/jira/browse/FLINK-11187 > Project: Flink > Issue Type: Bug > Components: FileSystem, Streaming Connectors >Affects Versions: 1.7.0, 1.7.1 >Reporter: Addison Higham >Assignee: Addison Higham >Priority: Major > Labels: pull-request-available > Fix For: 1.7.2, 1.8.0 > > Time Spent: 10m > Remaining Estimate: 0h > > When using the StreamingFileSink with S3A backend, occasionally, errors like > this will occur: > {noformat} > Caused by: > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: > Your socket connection to the server was not read from or written to within > the timeout period. Idle connections will be closed. (Service: Amazon S3; > Status Code: 400; Error Code: RequestTimeout; Request ID: xxx; S3 Extended > Request ID: xxx, S3 Extended Request ID: xxx > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056){noformat} > This causes a restart of flink job, which is often able to recover from, but > under heavy load, this can become very frequent. > > Turning on debug logs you can find the following relevant stack trace: > {noformat} > 2018-12-17 05:55:46,546 DEBUG > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient - FYI: > failed to reset content inputstream before throwing up > java.io.IOException: Resetting to invalid mark > at java.io.BufferedInputStream.reset(BufferedInputStream.java:448) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkBufferedInputStream.reset(SdkBufferedInputStream.java:106) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:168) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.lastReset(AmazonHttpClient.java:1145) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1070) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:3306) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3291) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.uploadPart(S3AFileSystem.java:1576) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.WriteOperationHelper.lambda$uploadPart$8(WriteOperationHelper.java:474) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:260) > at >
[jira] [Commented] (FLINK-11187) StreamingFileSink with S3 backend transient socket timeout issues
[ https://issues.apache.org/jira/browse/FLINK-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743882#comment-16743882 ] Aljoscha Krettek commented on FLINK-11187: -- Fixed on master in 9c07822acedfd7e0582481de620e5af205f4aef3 > StreamingFileSink with S3 backend transient socket timeout issues > -- > > Key: FLINK-11187 > URL: https://issues.apache.org/jira/browse/FLINK-11187 > Project: Flink > Issue Type: Bug > Components: FileSystem, Streaming Connectors >Affects Versions: 1.7.0, 1.7.1 >Reporter: Addison Higham >Assignee: Addison Higham >Priority: Major > Labels: pull-request-available > Fix For: 1.7.2, 1.8.0 > > Time Spent: 10m > Remaining Estimate: 0h > > When using the StreamingFileSink with S3A backend, occasionally, errors like > this will occur: > {noformat} > Caused by: > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: > Your socket connection to the server was not read from or written to within > the timeout period. Idle connections will be closed. (Service: Amazon S3; > Status Code: 400; Error Code: RequestTimeout; Request ID: xxx; S3 Extended > Request ID: xxx, S3 Extended Request ID: xxx > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056){noformat} > This causes a restart of flink job, which is often able to recover from, but > under heavy load, this can become very frequent. > > Turning on debug logs you can find the following relevant stack trace: > {noformat} > 2018-12-17 05:55:46,546 DEBUG > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient - FYI: > failed to reset content inputstream before throwing up > java.io.IOException: Resetting to invalid mark > at java.io.BufferedInputStream.reset(BufferedInputStream.java:448) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkBufferedInputStream.reset(SdkBufferedInputStream.java:106) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:168) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.lastReset(AmazonHttpClient.java:1145) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1070) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:3306) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3291) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.uploadPart(S3AFileSystem.java:1576) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.WriteOperationHelper.lambda$uploadPart$8(WriteOperationHelper.java:474) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:260) > at >
[jira] [Commented] (FLINK-11187) StreamingFileSink with S3 backend transient socket timeout issues
[ https://issues.apache.org/jira/browse/FLINK-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743769#comment-16743769 ] Kostas Kloudas commented on FLINK-11187: Thanks a lot for the help [~sthm] and for the work of course [~addis...@gmail.com]! We will merge probably today. > StreamingFileSink with S3 backend transient socket timeout issues > -- > > Key: FLINK-11187 > URL: https://issues.apache.org/jira/browse/FLINK-11187 > Project: Flink > Issue Type: Bug > Components: FileSystem, Streaming Connectors >Affects Versions: 1.7.0, 1.7.1 >Reporter: Addison Higham >Assignee: Addison Higham >Priority: Major > Labels: pull-request-available > Fix For: 1.7.2, 1.8.0 > > Time Spent: 10m > Remaining Estimate: 0h > > When using the StreamingFileSink with S3A backend, occasionally, errors like > this will occur: > {noformat} > Caused by: > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: > Your socket connection to the server was not read from or written to within > the timeout period. Idle connections will be closed. (Service: Amazon S3; > Status Code: 400; Error Code: RequestTimeout; Request ID: xxx; S3 Extended > Request ID: xxx, S3 Extended Request ID: xxx > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056){noformat} > This causes a restart of flink job, which is often able to recover from, but > under heavy load, this can become very frequent. > > Turning on debug logs you can find the following relevant stack trace: > {noformat} > 2018-12-17 05:55:46,546 DEBUG > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient - FYI: > failed to reset content inputstream before throwing up > java.io.IOException: Resetting to invalid mark > at java.io.BufferedInputStream.reset(BufferedInputStream.java:448) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkBufferedInputStream.reset(SdkBufferedInputStream.java:106) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:168) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.lastReset(AmazonHttpClient.java:1145) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1070) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:3306) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3291) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.uploadPart(S3AFileSystem.java:1576) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.WriteOperationHelper.lambda$uploadPart$8(WriteOperationHelper.java:474) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:260) > at >
[jira] [Commented] (FLINK-11187) StreamingFileSink with S3 backend transient socket timeout issues
[ https://issues.apache.org/jira/browse/FLINK-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743755#comment-16743755 ] Steffen Hausmann commented on FLINK-11187: -- I've kept the job running for several days and haven't seen the exception since. So I'm pretty sure that the fix properly addresses the bug. But as previously mentioned, providing a minimal example is quite challenging, as for my workload, the exception occurred roughly every 4-8 hours. > StreamingFileSink with S3 backend transient socket timeout issues > -- > > Key: FLINK-11187 > URL: https://issues.apache.org/jira/browse/FLINK-11187 > Project: Flink > Issue Type: Bug > Components: FileSystem, Streaming Connectors >Affects Versions: 1.7.0, 1.7.1 >Reporter: Addison Higham >Assignee: Addison Higham >Priority: Major > Labels: pull-request-available > Fix For: 1.7.2, 1.8.0 > > Time Spent: 10m > Remaining Estimate: 0h > > When using the StreamingFileSink with S3A backend, occasionally, errors like > this will occur: > {noformat} > Caused by: > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: > Your socket connection to the server was not read from or written to within > the timeout period. Idle connections will be closed. (Service: Amazon S3; > Status Code: 400; Error Code: RequestTimeout; Request ID: xxx; S3 Extended > Request ID: xxx, S3 Extended Request ID: xxx > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056){noformat} > This causes a restart of flink job, which is often able to recover from, but > under heavy load, this can become very frequent. > > Turning on debug logs you can find the following relevant stack trace: > {noformat} > 2018-12-17 05:55:46,546 DEBUG > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient - FYI: > failed to reset content inputstream before throwing up > java.io.IOException: Resetting to invalid mark > at java.io.BufferedInputStream.reset(BufferedInputStream.java:448) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkBufferedInputStream.reset(SdkBufferedInputStream.java:106) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:168) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.lastReset(AmazonHttpClient.java:1145) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1070) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:3306) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3291) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.uploadPart(S3AFileSystem.java:1576) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.WriteOperationHelper.lambda$uploadPart$8(WriteOperationHelper.java:474) > at >
[jira] [Commented] (FLINK-11187) StreamingFileSink with S3 backend transient socket timeout issues
[ https://issues.apache.org/jira/browse/FLINK-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16739662#comment-16739662 ] Addison Higham commented on FLINK-11187: PR opened here: [https://github.com/apache/flink/pull/7460] [~aljoscha] not in runtime code, but it is still used in unit tests to read back objects written. But yes, the name is now a bit of a misnomer, but IDK how extensive this refactor should be... > StreamingFileSink with S3 backend transient socket timeout issues > -- > > Key: FLINK-11187 > URL: https://issues.apache.org/jira/browse/FLINK-11187 > Project: Flink > Issue Type: Bug > Components: FileSystem, Streaming Connectors >Affects Versions: 1.7.0, 1.7.1 >Reporter: Addison Higham >Assignee: Addison Higham >Priority: Major > Labels: pull-request-available > Fix For: 1.7.2, 1.8.0 > > Time Spent: 10m > Remaining Estimate: 0h > > When using the StreamingFileSink with S3A backend, occasionally, errors like > this will occur: > {noformat} > Caused by: > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: > Your socket connection to the server was not read from or written to within > the timeout period. Idle connections will be closed. (Service: Amazon S3; > Status Code: 400; Error Code: RequestTimeout; Request ID: xxx; S3 Extended > Request ID: xxx, S3 Extended Request ID: xxx > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056){noformat} > This causes a restart of flink job, which is often able to recover from, but > under heavy load, this can become very frequent. > > Turning on debug logs you can find the following relevant stack trace: > {noformat} > 2018-12-17 05:55:46,546 DEBUG > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient - FYI: > failed to reset content inputstream before throwing up > java.io.IOException: Resetting to invalid mark > at java.io.BufferedInputStream.reset(BufferedInputStream.java:448) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkBufferedInputStream.reset(SdkBufferedInputStream.java:106) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:168) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.lastReset(AmazonHttpClient.java:1145) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1070) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:3306) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3291) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.uploadPart(S3AFileSystem.java:1576) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.WriteOperationHelper.lambda$uploadPart$8(WriteOperationHelper.java:474) > at >
[jira] [Commented] (FLINK-11187) StreamingFileSink with S3 backend transient socket timeout issues
[ https://issues.apache.org/jira/browse/FLINK-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738224#comment-16738224 ] Aljoscha Krettek commented on FLINK-11187: -- After the fix, does this even need the stream nature of {{RefCountedBufferingFileStream}} anymore? > StreamingFileSink with S3 backend transient socket timeout issues > -- > > Key: FLINK-11187 > URL: https://issues.apache.org/jira/browse/FLINK-11187 > Project: Flink > Issue Type: Bug > Components: FileSystem, Streaming Connectors >Affects Versions: 1.7.0, 1.7.1 >Reporter: Addison Higham >Assignee: Addison Higham >Priority: Major > Fix For: 1.7.2, 1.8.0 > > > When using the StreamingFileSink with S3A backend, occasionally, errors like > this will occur: > {noformat} > Caused by: > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: > Your socket connection to the server was not read from or written to within > the timeout period. Idle connections will be closed. (Service: Amazon S3; > Status Code: 400; Error Code: RequestTimeout; Request ID: xxx; S3 Extended > Request ID: xxx, S3 Extended Request ID: xxx > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056){noformat} > This causes a restart of flink job, which is often able to recover from, but > under heavy load, this can become very frequent. > > Turning on debug logs you can find the following relevant stack trace: > {noformat} > 2018-12-17 05:55:46,546 DEBUG > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient - FYI: > failed to reset content inputstream before throwing up > java.io.IOException: Resetting to invalid mark > at java.io.BufferedInputStream.reset(BufferedInputStream.java:448) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkBufferedInputStream.reset(SdkBufferedInputStream.java:106) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:168) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.lastReset(AmazonHttpClient.java:1145) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1070) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:3306) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3291) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.uploadPart(S3AFileSystem.java:1576) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.WriteOperationHelper.lambda$uploadPart$8(WriteOperationHelper.java:474) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:260) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:317) > at >
[jira] [Commented] (FLINK-11187) StreamingFileSink with S3 backend transient socket timeout issues
[ https://issues.apache.org/jira/browse/FLINK-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738131#comment-16738131 ] Steve Loughran commented on FLINK-11187: bq. As far as I can tell, this is just a transient s3 error, so it appears to be pretty rare but does happen more often with increased load. its believed to be a network error: upload of a single block fails, transfer manager wants to retry but the block buffering is such that it can't repost that block. AFAIK. no easy way to test this other than a long-haul upload over many hours.Not unless you can insert an unreliable proxy or sneak in an httpclient which fails (possible if you have your own modified apache httpclient, run the SDK unshaded, etc...) BTW: do make sure that final POST does retries too; some of the AWS SDKs have under-retried there, and you don't want a 4h upload to fail at the final step. > StreamingFileSink with S3 backend transient socket timeout issues > -- > > Key: FLINK-11187 > URL: https://issues.apache.org/jira/browse/FLINK-11187 > Project: Flink > Issue Type: Bug > Components: FileSystem, Streaming Connectors >Affects Versions: 1.7.0, 1.7.1 >Reporter: Addison Higham >Assignee: Addison Higham >Priority: Major > Fix For: 1.7.2, 1.8.0 > > > When using the StreamingFileSink with S3A backend, occasionally, errors like > this will occur: > {noformat} > Caused by: > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: > Your socket connection to the server was not read from or written to within > the timeout period. Idle connections will be closed. (Service: Amazon S3; > Status Code: 400; Error Code: RequestTimeout; Request ID: xxx; S3 Extended > Request ID: xxx, S3 Extended Request ID: xxx > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056){noformat} > This causes a restart of flink job, which is often able to recover from, but > under heavy load, this can become very frequent. > > Turning on debug logs you can find the following relevant stack trace: > {noformat} > 2018-12-17 05:55:46,546 DEBUG > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient - FYI: > failed to reset content inputstream before throwing up > java.io.IOException: Resetting to invalid mark > at java.io.BufferedInputStream.reset(BufferedInputStream.java:448) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkBufferedInputStream.reset(SdkBufferedInputStream.java:106) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:168) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.lastReset(AmazonHttpClient.java:1145) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1070) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:3306) > at >
[jira] [Commented] (FLINK-11187) StreamingFileSink with S3 backend transient socket timeout issues
[ https://issues.apache.org/jira/browse/FLINK-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738029#comment-16738029 ] Kostas Kloudas commented on FLINK-11187: Hi [~addis...@gmail.com], thanks for working on that. I understand that reproducing transient network failures between yours and a system you cannot control (s3) is difficult. This is why I asked if you have an idea about how to add a minimum example that is reproducible and always fails. Using minio was also discussed internally when we were writing the s3 connector, but we decided to go with using s3 instead. Maybe we should reconsider, at least for some tests like this. In any case, feel free to open a PR with your changes (even without a test for now) and I will review it as soon as possible. > StreamingFileSink with S3 backend transient socket timeout issues > -- > > Key: FLINK-11187 > URL: https://issues.apache.org/jira/browse/FLINK-11187 > Project: Flink > Issue Type: Bug > Components: FileSystem, Streaming Connectors >Affects Versions: 1.7.0, 1.7.1 >Reporter: Addison Higham >Assignee: Addison Higham >Priority: Major > Fix For: 1.7.2, 1.8.0 > > > When using the StreamingFileSink with S3A backend, occasionally, errors like > this will occur: > {noformat} > Caused by: > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: > Your socket connection to the server was not read from or written to within > the timeout period. Idle connections will be closed. (Service: Amazon S3; > Status Code: 400; Error Code: RequestTimeout; Request ID: xxx; S3 Extended > Request ID: xxx, S3 Extended Request ID: xxx > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056){noformat} > This causes a restart of flink job, which is often able to recover from, but > under heavy load, this can become very frequent. > > Turning on debug logs you can find the following relevant stack trace: > {noformat} > 2018-12-17 05:55:46,546 DEBUG > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient - FYI: > failed to reset content inputstream before throwing up > java.io.IOException: Resetting to invalid mark > at java.io.BufferedInputStream.reset(BufferedInputStream.java:448) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkBufferedInputStream.reset(SdkBufferedInputStream.java:106) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:168) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.lastReset(AmazonHttpClient.java:1145) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1070) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:3306) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3291) > at >
[jira] [Commented] (FLINK-11187) StreamingFileSink with S3 backend transient socket timeout issues
[ https://issues.apache.org/jira/browse/FLINK-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737665#comment-16737665 ] Addison Higham commented on FLINK-11187: Hey [~kkl0u], I took a crack at making the relevant changes here: [https://github.com/instructure/flink/tree/s3_file] Additionally, I don't think it should take anything special to cause a repro, but things that should help based on what I understand of the bug: - Creating larger files so you have s3 multipart parts (i.e. don't create new flink parts as often) - hitting it from quite a few servers, we only saw this behavior once we had 20+ servers, parallelism of 100+ and even then it was only once every few hours As far as I can tell, this is just a transient s3 error, so it appears to be pretty rare but does happen more often with increased load. I made some tweaks to the the flink-streaming-file-sink e2e test (here: [https://github.com/instructure/flink/tree/s3_repro)] and was trying to force an s3 failure by using tcpkill to kill a connection, but haven't been able to repro it yet. I think a better approach might be to using something like minio (an alternate s3 implementation) and kill it temporarily to force a failed part upload. > StreamingFileSink with S3 backend transient socket timeout issues > -- > > Key: FLINK-11187 > URL: https://issues.apache.org/jira/browse/FLINK-11187 > Project: Flink > Issue Type: Bug > Components: FileSystem, Streaming Connectors >Affects Versions: 1.7.0, 1.7.1 >Reporter: Addison Higham >Assignee: Addison Higham >Priority: Major > Fix For: 1.7.2, 1.8.0 > > > When using the StreamingFileSink with S3A backend, occasionally, errors like > this will occur: > {noformat} > Caused by: > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: > Your socket connection to the server was not read from or written to within > the timeout period. Idle connections will be closed. (Service: Amazon S3; > Status Code: 400; Error Code: RequestTimeout; Request ID: xxx; S3 Extended > Request ID: xxx, S3 Extended Request ID: xxx > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056){noformat} > This causes a restart of flink job, which is often able to recover from, but > under heavy load, this can become very frequent. > > Turning on debug logs you can find the following relevant stack trace: > {noformat} > 2018-12-17 05:55:46,546 DEBUG > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient - FYI: > failed to reset content inputstream before throwing up > java.io.IOException: Resetting to invalid mark > at java.io.BufferedInputStream.reset(BufferedInputStream.java:448) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkBufferedInputStream.reset(SdkBufferedInputStream.java:106) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:168) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.lastReset(AmazonHttpClient.java:1145) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1070) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325) >
[jira] [Commented] (FLINK-11187) StreamingFileSink with S3 backend transient socket timeout issues
[ https://issues.apache.org/jira/browse/FLINK-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736889#comment-16736889 ] Kostas Kloudas commented on FLINK-11187: Thanks for checking this [~sthm] ! It would be really helpful if you could also provide a minimal example that we can use for a test. And of course, let us know if these changes solve the problem or not. > StreamingFileSink with S3 backend transient socket timeout issues > -- > > Key: FLINK-11187 > URL: https://issues.apache.org/jira/browse/FLINK-11187 > Project: Flink > Issue Type: Bug > Components: FileSystem, Streaming Connectors >Affects Versions: 1.7.0, 1.7.1 >Reporter: Addison Higham >Assignee: Addison Higham >Priority: Major > Fix For: 1.7.2, 1.8.0 > > > When using the StreamingFileSink with S3A backend, occasionally, errors like > this will occur: > {noformat} > Caused by: > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: > Your socket connection to the server was not read from or written to within > the timeout period. Idle connections will be closed. (Service: Amazon S3; > Status Code: 400; Error Code: RequestTimeout; Request ID: xxx; S3 Extended > Request ID: xxx, S3 Extended Request ID: xxx > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056){noformat} > This causes a restart of flink job, which is often able to recover from, but > under heavy load, this can become very frequent. > > Turning on debug logs you can find the following relevant stack trace: > {noformat} > 2018-12-17 05:55:46,546 DEBUG > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient - FYI: > failed to reset content inputstream before throwing up > java.io.IOException: Resetting to invalid mark > at java.io.BufferedInputStream.reset(BufferedInputStream.java:448) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkBufferedInputStream.reset(SdkBufferedInputStream.java:106) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:168) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.lastReset(AmazonHttpClient.java:1145) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1070) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:3306) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3291) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.uploadPart(S3AFileSystem.java:1576) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.WriteOperationHelper.lambda$uploadPart$8(WriteOperationHelper.java:474) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:260) > at >
[jira] [Commented] (FLINK-11187) StreamingFileSink with S3 backend transient socket timeout issues
[ https://issues.apache.org/jira/browse/FLINK-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736855#comment-16736855 ] Steffen Hausmann commented on FLINK-11187: -- Thanks for looking into this, [~kkl0u] and [~addis...@gmail.com]. I've applied the changes proposed by Kostas on top of the 1.7.1 release. I ran the same job mentioned in the email thread for five hours and haven't seen the error during that time. This is a very promising sing, but I'll keep my job running to verify that the error occurs at a later point and let you know what I find. > StreamingFileSink with S3 backend transient socket timeout issues > -- > > Key: FLINK-11187 > URL: https://issues.apache.org/jira/browse/FLINK-11187 > Project: Flink > Issue Type: Bug > Components: FileSystem, Streaming Connectors >Affects Versions: 1.7.0, 1.7.1 >Reporter: Addison Higham >Assignee: Addison Higham >Priority: Major > Fix For: 1.7.2, 1.8.0 > > > When using the StreamingFileSink with S3A backend, occasionally, errors like > this will occur: > {noformat} > Caused by: > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: > Your socket connection to the server was not read from or written to within > the timeout period. Idle connections will be closed. (Service: Amazon S3; > Status Code: 400; Error Code: RequestTimeout; Request ID: xxx; S3 Extended > Request ID: xxx, S3 Extended Request ID: xxx > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056){noformat} > This causes a restart of flink job, which is often able to recover from, but > under heavy load, this can become very frequent. > > Turning on debug logs you can find the following relevant stack trace: > {noformat} > 2018-12-17 05:55:46,546 DEBUG > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient - FYI: > failed to reset content inputstream before throwing up > java.io.IOException: Resetting to invalid mark > at java.io.BufferedInputStream.reset(BufferedInputStream.java:448) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkBufferedInputStream.reset(SdkBufferedInputStream.java:106) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:168) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.lastReset(AmazonHttpClient.java:1145) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1070) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:3306) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3291) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.uploadPart(S3AFileSystem.java:1576) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.WriteOperationHelper.lambda$uploadPart$8(WriteOperationHelper.java:474) > at >
[jira] [Commented] (FLINK-11187) StreamingFileSink with S3 backend transient socket timeout issues
[ https://issues.apache.org/jira/browse/FLINK-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735999#comment-16735999 ] Kostas Kloudas commented on FLINK-11187: Hi [~addis...@gmail.com] and [~ste...@apache.org]. Thanks for looking into this and for digging into the code. >From what you describe, I would lean towards the solution that exposes the >{{File}} in the {{RefCountedFSOutputStream}} and go for the second solution >proposed by [~addis...@gmail.com]. For uniformity, also the {{putObject()}} should change to take a {{File}} as an argument (and get rid of the {{length}} argument. So effectively, the two methods in the {{HadoopS3AccessHelper}} will become: {code:java} @Override public UploadPartResult uploadPart(String key, String uploadId, int partNumber, File inputFile, long length) throws IOException { final UploadPartRequest uploadRequest = s3accessHelper.newUploadPartRequest( key, uploadId, partNumber, MathUtils.checkedDownCast(length), null, inputFile, 0L); return s3accessHelper.uploadPart(uploadRequest); } @Override public PutObjectResult putObject(String key, File inputFile) throws IOException { final PutObjectRequest putRequest = s3accessHelper.createPutObjectRequest(key, inputFile); return s3accessHelper.putObject(putRequest); } {code} [~addis...@gmail.com] Do you have a minimal example that verifies that this solution solves the problem? > StreamingFileSink with S3 backend transient socket timeout issues > -- > > Key: FLINK-11187 > URL: https://issues.apache.org/jira/browse/FLINK-11187 > Project: Flink > Issue Type: Bug > Components: FileSystem, Streaming Connectors >Affects Versions: 1.7.0, 1.7.1 >Reporter: Addison Higham >Assignee: Addison Higham >Priority: Major > Fix For: 1.7.2, 1.8.0 > > > When using the StreamingFileSink with S3A backend, occasionally, errors like > this will occur: > {noformat} > Caused by: > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: > Your socket connection to the server was not read from or written to within > the timeout period. Idle connections will be closed. (Service: Amazon S3; > Status Code: 400; Error Code: RequestTimeout; Request ID: xxx; S3 Extended > Request ID: xxx, S3 Extended Request ID: xxx > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056){noformat} > This causes a restart of flink job, which is often able to recover from, but > under heavy load, this can become very frequent. > > Turning on debug logs you can find the following relevant stack trace: > {noformat} > 2018-12-17 05:55:46,546 DEBUG > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient - FYI: > failed to reset content inputstream before throwing up > java.io.IOException: Resetting to invalid mark > at java.io.BufferedInputStream.reset(BufferedInputStream.java:448) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkBufferedInputStream.reset(SdkBufferedInputStream.java:106) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:168) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.lastReset(AmazonHttpClient.java:1145) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1070) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) > at >
[jira] [Commented] (FLINK-11187) StreamingFileSink with S3 backend transient socket timeout issues
[ https://issues.apache.org/jira/browse/FLINK-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725942#comment-16725942 ] Steve Loughran commented on FLINK-11187: There's limits to how well you can buffer...if the post fails it goes back to byte 0 and retries, which is done with a reset() to the marker set when uploading that part began. Better to serve up directly from a file, if that's the source, or, if its memory, having an input stream which [supports mark/reset to anywhere in the entire block|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3ADataBlocks.java#L640] > StreamingFileSink with S3 backend transient socket timeout issues > -- > > Key: FLINK-11187 > URL: https://issues.apache.org/jira/browse/FLINK-11187 > Project: Flink > Issue Type: Bug > Components: FileSystem, Streaming Connectors >Affects Versions: 1.7.0, 1.7.1 >Reporter: Addison Higham >Assignee: Addison Higham >Priority: Major > Fix For: 1.7.2 > > > When using the StreamingFileSink with S3A backend, occasionally, errors like > this will occur: > {noformat} > Caused by: > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: > Your socket connection to the server was not read from or written to within > the timeout period. Idle connections will be closed. (Service: Amazon S3; > Status Code: 400; Error Code: RequestTimeout; Request ID: xxx; S3 Extended > Request ID: xxx, S3 Extended Request ID: xxx > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056){noformat} > This causes a restart of flink job, which is often able to recover from, but > under heavy load, this can become very frequent. > > Turning on debug logs you can find the following relevant stack trace: > {noformat} > 2018-12-17 05:55:46,546 DEBUG > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient - FYI: > failed to reset content inputstream before throwing up > java.io.IOException: Resetting to invalid mark > at java.io.BufferedInputStream.reset(BufferedInputStream.java:448) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkBufferedInputStream.reset(SdkBufferedInputStream.java:106) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:168) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.lastReset(AmazonHttpClient.java:1145) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1070) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:3306) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3291) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.uploadPart(S3AFileSystem.java:1576) > at >
[jira] [Commented] (FLINK-11187) StreamingFileSink with S3 backend transient socket timeout issues
[ https://issues.apache.org/jira/browse/FLINK-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16723402#comment-16723402 ] Addison Higham commented on FLINK-11187: See [https://lists.apache.org/thread.html/87eb8e16abad09232da5fbc6999c19c4fba0f16d641c565e62096564@%3Cuser.flink.apache.org%3E] for the first discussion of this on the mailing list. > StreamingFileSink with S3 backend transient socket timeout issues > -- > > Key: FLINK-11187 > URL: https://issues.apache.org/jira/browse/FLINK-11187 > Project: Flink > Issue Type: Bug > Components: FileSystem, Streaming Connectors >Affects Versions: 1.7.0, 1.7.1 >Reporter: Addison Higham >Assignee: Addison Higham >Priority: Major > Fix For: 1.7.2 > > > When using the StreamingFileSink with S3A backend, occasionally, errors like > this will occur: > {noformat} > Caused by: > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: > Your socket connection to the server was not read from or written to within > the timeout period. Idle connections will be closed. (Service: Amazon S3; > Status Code: 400; Error Code: RequestTimeout; Request ID: xxx; S3 Extended > Request ID: xxx, S3 Extended Request ID: xxx > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056){noformat} > This causes a restart of flink job, which is often able to recover from, but > under heavy load, this can become very frequent. > > Turning on debug logs you can find the following relevant stack trace: > {noformat} > 2018-12-17 05:55:46,546 DEBUG > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient - FYI: > failed to reset content inputstream before throwing up > java.io.IOException: Resetting to invalid mark > at java.io.BufferedInputStream.reset(BufferedInputStream.java:448) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkBufferedInputStream.reset(SdkBufferedInputStream.java:106) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:168) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.lastReset(AmazonHttpClient.java:1145) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1070) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:3306) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3291) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.uploadPart(S3AFileSystem.java:1576) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.WriteOperationHelper.lambda$uploadPart$8(WriteOperationHelper.java:474) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:260) > at >