[
https://issues.apache.org/jira/browse/FLINK-23802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated FLINK-23802:
-----------------------------------
Labels: pull-request-available (was: )
> [kinesis][efo] Reduce ReadTimeoutExceptions for Kinesis Consumer
> ----------------------------------------------------------------
>
> Key: FLINK-23802
> URL: https://issues.apache.org/jira/browse/FLINK-23802
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / Kinesis
> Affects Versions: 1.12.0, 1.12.1, 1.12.2, 1.13.0, 1.12.3, 1.13.1, 1.12.4,
> 1.12.5, 1.13.2
> Reporter: Danny Cranmer
> Assignee: Danny Cranmer
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.14.0, 1.12.6, 1.13.3
>
>
> h3. Background
> The Kinesis EFO consumer uses an async AWS SDK Netty client to read records
> from Kinesis. When the client is inactive for 30 seconds a
> {{ReadTimeoutException}} is thrown by Netty. The consumer will terminate the
> subscription, backoff and retry. Jobs with high backpressure can result in
> frequent {{ReadTImeoutException}} and the frequent backoff and retry can
> cause unnecessary overhead.
> h3. What?
> Reduce/eliminate {{ReadTimeoutException}} from the EFO consumer
> h3. How?
>
> There are 2 improvements to be made:
> 1. Request next record from the Flink source thread rather than the AWS SDK
> response thread. This means that there will always be space in the input
> buffer queue. The AWS SDK async response thread is no longer blocking on this
> queue. Backpressure is now applied by the Flink source thread rather than the
> AWS SDK thread.
> 2. Increase the Read Timeout (30s) to be higher than the maximum Shard
> subscription duration (5m) and enable TCP keep alive
> h3. References
> This has already been implemented and tested in
> [amazon-kinesis-connector-flink|https://github.com/awslabs/amazon-kinesis-connector-flink]:
> - [Prevent SDK threads
> blocking|https://github.com/awslabs/amazon-kinesis-connector-flink/pull/40]
> - [Increase read timeout and enable TCP
> keepalive|https://github.com/awslabs/amazon-kinesis-connector-flink/pull/42]
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)