hajimeni created FLINK-37918:
--------------------------------

             Summary: Restore the ability to set an interval for GetRecords calls to Kinesis shards.
                 Key: FLINK-37918
                 URL: https://issues.apache.org/jira/browse/FLINK-37918
             Project: Flink
          Issue Type: Improvement
          Components: Connectors / Kinesis
    Affects Versions: aws-connector-5.0.0
            Reporter: hajimeni

h3. Background

The previous Flink Kinesis connector (flink-connector-kinesis) provided a configuration parameter, SHARD_GETRECORDS_INTERVAL_MILLIS, which let users set the interval between GetRecords calls for each shard. This functionality is absent from the new AWS Kinesis Streams connector (flink-connector-aws-kinesis-streams).

h3. Problem

Because the new connector (KinesisStreamsSource) has no configurable interval for GetRecords calls, it is hard to operate when multiple consumers read from the same Kinesis stream. Without a way to increase the interval between GetRecords calls, the consumers together can easily exceed the AWS Kinesis limit of five GetRecords calls per second per shard. This leads to several issues:

* Wasted API calls and increased costs: continuous, rapid calls that are likely to be throttled are inefficient and can lead to increased costs.
* Operational instability: in a multi-tenant or multi-application environment, the absence of this control makes it difficult to ensure stable and predictable data consumption across all consumers.

AWS documentation recommends adjusting the frequency of GetRecords calls to avoid these issues, especially when multiple consumers are involved. The recommendation is in the AWS Kinesis Developer Guide (see: https://docs.aws.amazon.com/streams/latest/dev/kinesis-low-latency.html ).

h3. Feature Request

We request the re-introduction of a configuration option, similar to SHARD_GETRECORDS_INTERVAL_MILLIS, in the flink-connector-aws-kinesis-streams connector. It would let users control the rate of GetRecords calls per shard, which prevents API throttling and keeps Flink applications that consume from Kinesis Data Streams stable and efficient.
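
To make the request concrete, here is a minimal sketch of the arithmetic behind choosing such an interval. It assumes each consumer application runs one poller per shard and that all consumers share the five-calls-per-second budget evenly; the class and method names are hypothetical and not part of either connector. The property key is what the legacy SHARD_GETRECORDS_INTERVAL_MILLIS constant is believed to resolve to, so please verify it against the connector version in use.

```java
import java.util.Properties;

final class GetRecordsInterval {
    // Limit cited in this issue: five GetRecords calls per second per shard.
    static final int MAX_CALLS_PER_SECOND_PER_SHARD = 5;

    // Assumption: key behind the legacy SHARD_GETRECORDS_INTERVAL_MILLIS constant.
    static final String LEGACY_INTERVAL_KEY = "flink.shard.getrecords.intervalmillis";

    /**
     * Smallest per-consumer polling interval (ms) that keeps `consumers`
     * evenly paced pollers within the per-shard call limit.
     */
    static long minIntervalMillis(int consumers) {
        if (consumers < 1) {
            throw new IllegalArgumentException("consumers must be >= 1");
        }
        // Combined rate is consumers * (1000 / interval) calls per second,
        // so interval >= 1000 * consumers / limit; round up to whole ms.
        return (1000L * consumers + MAX_CALLS_PER_SECOND_PER_SHARD - 1)
                / MAX_CALLS_PER_SECOND_PER_SHARD;
    }

    /** Builds properties in the style the legacy connector accepted. */
    static Properties legacyConsumerConfig(int consumers) {
        Properties props = new Properties();
        props.setProperty(LEGACY_INTERVAL_KEY, Long.toString(minIntervalMillis(consumers)));
        return props;
    }

    public static void main(String[] args) {
        // Three applications reading the same stream: 600 ms per consumer.
        Properties props = legacyConsumerConfig(3);
        System.out.println(props.getProperty(LEGACY_INTERVAL_KEY));
    }
}
```

With the legacy connector, properties like these were passed to the consumer at construction time. The new connector currently offers no equivalent setting, which is the gap this issue asks to close.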