hajimeni created FLINK-37918:
--------------------------------

             Summary: Restore the ability to set an interval for GetRecords 
calls to Kinesis shards.
                 Key: FLINK-37918
                 URL: https://issues.apache.org/jira/browse/FLINK-37918
             Project: Flink
          Issue Type: Improvement
          Components: Connectors / Kinesis
    Affects Versions: aws-connector-5.0.0
            Reporter: hajimeni


h3. Background

The previous Flink Kinesis connector (flink-connector-kinesis) provided a 
configuration parameter, SHARD_GETRECORDS_INTERVAL_MILLIS, which allowed users 
to set a specific interval between GetRecords calls for each shard. This 
functionality is absent in the new AWS Kinesis Streams connector 
(flink-connector-aws-kinesis-streams).
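
For context, a minimal sketch of how the interval was typically set with the previous connector. It uses only java.util.Properties so that it runs without Flink on the classpath. The key string "flink.shard.getrecords.intervalmillis" is assumed to be the value behind ConsumerConfigConstants.SHARD_GETRECORDS_INTERVAL_MILLIS and should be checked against the connector source. The class name, helper method, and region value are illustrative only.

```java
import java.util.Properties;

// Sketch only: builds the kind of Properties object that was passed to the
// legacy FlinkKinesisConsumer. Nothing here talks to AWS or Flink.
final class LegacyGetRecordsIntervalConfig {

    // Assumed value of ConsumerConfigConstants.SHARD_GETRECORDS_INTERVAL_MILLIS.
    static final String SHARD_GETRECORDS_INTERVAL_MILLIS =
            "flink.shard.getrecords.intervalmillis";

    // Hypothetical helper: returns consumer properties with the given interval.
    static Properties consumerConfig(long intervalMillis) {
        if (intervalMillis < 0) {
            throw new IllegalArgumentException("intervalMillis must be >= 0");
        }
        Properties props = new Properties();
        props.setProperty("aws.region", "us-east-1"); // placeholder region
        props.setProperty(SHARD_GETRECORDS_INTERVAL_MILLIS, Long.toString(intervalMillis));
        return props;
    }

    public static void main(String[] args) {
        // Ask each shard consumer to wait one second between GetRecords calls.
        System.out.println(consumerConfig(1000L));
    }
}
```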

h3. Problem

The lack of a configurable interval for GetRecords calls in the new connector 
(KinesisStreamsSource) poses a significant challenge when multiple consumers 
read from the same Kinesis stream. Without the ability to increase the 
interval between GetRecords calls, consumers can easily exceed the AWS 
Kinesis limit of five GetRecords calls per second per shard. This leads to 
several issues:
 * Wasted API Calls and Increased Costs: Continuous, rapid calls that are 
likely to be throttled are inefficient and can lead to increased costs.
 * Operational Instability: In a multi-tenant or multi-application environment, 
the absence of this control makes it difficult to ensure stable and predictable 
data consumption across all consumers.

The AWS documentation recommends adjusting the frequency of GetRecords calls 
to avoid these issues, especially when multiple consumers are involved. See 
the AWS Kinesis Developer Guide: 
https://docs.aws.amazon.com/streams/latest/dev/kinesis-low-latency.html
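
To make the limit concrete, here is a rough calculation of the smallest polling interval each consumer would need to stay within five GetRecords calls per second per shard. It assumes every consumer polls each shard at the same fixed rate and that the quota is shared evenly. It ignores retries and the separate per-shard read throughput limit. The class and method names are illustrative.

```java
// Sketch only: arithmetic on the per-shard GetRecords call quota.
final class GetRecordsBudget {

    // AWS limit cited above: five GetRecords calls per second per shard.
    static final int MAX_CALLS_PER_SECOND_PER_SHARD = 5;

    // Smallest interval (ms) at which each of `consumers` pollers can call
    // GetRecords on one shard without the combined rate exceeding the limit.
    static long minIntervalMillis(int consumers) {
        if (consumers < 1) {
            throw new IllegalArgumentException("consumers must be >= 1");
        }
        // ceil(consumers * 1000 / limit), computed with integer arithmetic
        return (consumers * 1000L + MAX_CALLS_PER_SECOND_PER_SHARD - 1)
                / MAX_CALLS_PER_SECOND_PER_SHARD;
    }

    public static void main(String[] args) {
        for (int consumers = 1; consumers <= 4; consumers++) {
            System.out.printf("%d consumer(s): at least %d ms between calls%n",
                    consumers, minIntervalMillis(consumers));
        }
    }
}
```

Under these assumptions a single consumer needs at least 200 ms between calls, and two consumers sharing a shard need at least 400 ms each. Without a configurable interval, the second figure cannot be enforced.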

h3. Feature Request

We request the re-introduction of a configuration option, similar to 
SHARD_GETRECORDS_INTERVAL_MILLIS, in the flink-connector-aws-kinesis-streams 
connector. This would allow users to effectively manage the rate of GetRecords 
calls per shard, thereby preventing API throttling and ensuring the stability 
and efficiency of Flink applications that consume data from Kinesis Data 
Streams.
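
As an illustration of the requested behaviour, here is a minimal sketch of per-shard pacing driven by such an interval. It is not code from either connector. ShardPollPacer and its methods are hypothetical names, and timestamps are passed in explicitly so the logic can be exercised without sleeping.

```java
// Sketch only: tracks when the next GetRecords call for one shard is due.
final class ShardPollPacer {
    private final long intervalMillis;
    private boolean called = false;
    private long lastCallMillis;

    ShardPollPacer(long intervalMillis) {
        if (intervalMillis < 0) {
            throw new IllegalArgumentException("intervalMillis must be >= 0");
        }
        this.intervalMillis = intervalMillis;
    }

    // How long to wait, as of nowMillis, before the next call is allowed.
    long millisUntilNextCall(long nowMillis) {
        if (!called) {
            return 0L; // the first call is never delayed
        }
        long elapsed = nowMillis - lastCallMillis;
        return Math.max(0L, intervalMillis - elapsed);
    }

    // Record that a GetRecords call was issued at nowMillis.
    void recordCall(long nowMillis) {
        called = true;
        lastCallMillis = nowMillis;
    }

    public static void main(String[] args) {
        ShardPollPacer pacer = new ShardPollPacer(1000L);
        pacer.recordCall(5000L);
        System.out.println(pacer.millisUntilNextCall(5300L));
    }
}
```

A source reader could consult such a pacer before each fetch and defer the shard until the remaining wait reaches zero. How this would fit into KinesisStreamsSource is left open here.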



--
This message was sent by Atlassian Jira
(v8.20.10#820010)