Hi all,

I am looking at the FLIP-27 proposal which refactored the source
interfaces. As per the flip there is mention of multi-split, multi-threaded
readers [1] but i dont see these implemented. The main support i see is for
a single threaded multiplex reader [2]. This poses an issue for connectors
like DDB Streams and Kinesis where the aws supported libraries like KCL [3]
support one thread per shard, where as in Flink connector, there is a queue
of shards per reader [4]. This queueing nature creates problems like a
single split which might be taking more time can block other splits, and
does not seem like the right experience. I would like to understand whether
there is a mechanism to better support such multi-threaded readers rather
than relying on this single-threaded approach?

Thanks
Abhi



[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95653746#FLIP27:RefactorSourceInterface-BaseImplementationBaseimplementationandhigh-levelreaders
[2]
https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/source/reader/SingleThreadMultiplexSourceReaderBase.java
[3] https://github.com/awslabs/amazon-kinesis-client
[4]
https://github.com/apache/flink-connector-aws/blob/main/flink-connector-aws/flink-connector-dynamodb/src/main/java/org/apache/flink/connector/dynamodb/source/reader/PollingDynamoDbStreamsShardSplitReader.java

Reply via email to