Krzysztof Dziolak created FLINK-35815:
-----------------------------------------
Summary: KinesisProxySyncV2 doesn't always retry throttling
exceptions.
Key: FLINK-35815
URL: https://issues.apache.org/jira/browse/FLINK-35815
Project: Flink
Issue Type: Bug
Components: Connectors / Kinesis
Affects Versions: 1.16.1, 1.15.4, aws-connector-4.2.0, aws-connector-4.3.0
Reporter: Krzysztof Dziolak
Assignee: Aleksandr Pilipenko
Fix For: aws-connector-4.4.0
Problem:
When FlinkKinesisConsumer is configured with legacy watermarking system, it is
unable to take a savepoint during stop-with-savepoint, and will get stuck
indefinitely.
{code:java}
FlinkKinesisConsumer src = new FlinkKinesisConsumer("YourStreamHere", new
SimpleStringSchema(), consumerConfig);
// Set up watermark assigner on Kinesis source
src.setPeriodicWatermarkAssigner(...);
// Set up watermark tracker on Kinesis source
src.setWatermarkTracker(...);{code}
*Why does it get stuck?*
When watermarks are setup, the `shardConsumer` and `recordEmitter` thread
communicate using asynchronous queue.
On stop-with-savepoint, shardConsumer waits for queue to empty before
continuing. recordEmitter is terminated before queue is empty. As such, queue
is never going to be empty, and app gets stuck indefinitely.
*Workarounds*
Use the new watermark framework
{code:java}
FlinkKinesisConsumer src = new FlinkKinesisConsumer("YourStreamHere", new
SimpleStringSchema(), consumerConfig);
env.addSource(src)
// Set up watermark strategy with both watermark assigner and watermark tracker
.assignTimestampsAndWatermarks(WatermarkStrategy.forMonotonousTimestamps()){code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)