[ 
https://issues.apache.org/jira/browse/FLINK-23802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer resolved FLINK-23802.
-----------------------------------
    Resolution: Fixed

> [kinesis][efo] Reduce ReadTimeoutExceptions for Kinesis Consumer
> ----------------------------------------------------------------
>
>                 Key: FLINK-23802
>                 URL: https://issues.apache.org/jira/browse/FLINK-23802
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / Kinesis
>    Affects Versions: 1.12.0, 1.12.1, 1.12.2, 1.13.0, 1.12.3, 1.13.1, 1.12.4, 
> 1.12.5, 1.13.2
>            Reporter: Danny Cranmer
>            Assignee: Danny Cranmer
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.14.0, 1.12.6, 1.13.3
>
>
> h3. Background
> The Kinesis EFO consumer uses an async AWS SDK Netty client to read records 
> from Kinesis. When the client is inactive for 30 seconds a 
> {{ReadTimeoutException}} is thrown by Netty. The consumer will terminate the 
> subscription, backoff and retry. Jobs with high backpressure can result in 
> frequent {{ReadTImeoutException}} and the frequent backoff and retry can 
> cause unnecessary overhead.
> h3. What?
> Reduce/eliminate {{ReadTimeoutException}} from the EFO consumer
> h3. How?
>   
> There are 2 improvements to be made:
> 1. Request next record from the Flink source thread rather than the AWS SDK 
> response thread. This means that there will always be space in the input 
> buffer queue. The AWS SDK async response thread is no longer blocking on this 
> queue. Backpressure is now applied by the Flink source thread rather than the 
> AWS SDK thread.
> 2. Increase the Read Timeout (30s) to be higher than the maximum Shard 
> subscription duration (5m) and enable TCP keep alive
> h3. References
> This has already been implemented and tested in 
> [amazon-kinesis-connector-flink|https://github.com/awslabs/amazon-kinesis-connector-flink]:
> - [Prevent SDK threads 
> blocking|https://github.com/awslabs/amazon-kinesis-connector-flink/pull/40]
> - [Increase read timeout and enable TCP 
> keepalive|https://github.com/awslabs/amazon-kinesis-connector-flink/pull/42]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to