Poorvank Bhatia created FLINK-38414:
---------------------------------------
Summary: Add Vitess Pipeline Connector with Parallel Shard Reading Support
Key: FLINK-38414
URL: https://issues.apache.org/jira/browse/FLINK-38414
Project: Flink
Issue Type: New Feature
Components: Flink CDC
Reporter: Poorvank Bhatia
The current Vitess CDC connector
([flink-connector-vitess-cdc|https://github.com/apache/flink-cdc/tree/master/flink-cdc-connect/flink-cdc-source-connectors/flink-connector-vitess-cdc])
is based on
[Debezium|https://github.com/apache/flink-cdc/blob/master/flink-cdc-connect/flink-cdc-source-connectors/flink-connector-vitess-cdc/src/main/java/org/apache/flink/cdc/connectors/vitess/VitessSource.java#L256]
and has a critical limitation: it is hardcoded to use a single task
(tasks.max=1), so reading from sharded Vitess keyspaces cannot be
parallelized. This is a major bottleneck for production deployments with
hundreds of shards.
This ticket proposes adding a new pipeline connector
(flink-cdc-pipeline-connector-vitess) that uses the FLIP-27 Source API to
enable parallel reading, with one worker per shard.
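
As a rough illustration of the "one worker per shard" idea, the sketch below distributes shard names round-robin across a fixed number of parallel readers. It is plain Java with no Flink dependency. The class name ShardAssignmentSketch, the method assignShards, and the example shard names are hypothetical and are not part of any existing connector. In a FLIP-27 source, comparable logic would presumably live in the split enumerator, but that placement is an assumption here, not the proposed design.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: not the connector's actual code.
public class ShardAssignmentSketch {

    /**
     * Spreads shard names round-robin over reader indices 0..parallelism-1.
     * With parallelism equal to the shard count, each reader gets exactly one shard.
     */
    static Map<Integer, List<String>> assignShards(List<String> shards, int parallelism) {
        if (parallelism <= 0) {
            throw new IllegalArgumentException("parallelism must be positive");
        }
        Map<Integer, List<String>> assignment = new TreeMap<>();
        for (int reader = 0; reader < parallelism; reader++) {
            assignment.put(reader, new ArrayList<>());
        }
        for (int i = 0; i < shards.size(); i++) {
            assignment.get(i % parallelism).add(shards.get(i));
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Example key-range shard names for a four-shard keyspace (assumed).
        List<String> shards = List.of("-40", "40-80", "80-c0", "c0-");

        // One reader per shard.
        System.out.println(assignShards(shards, 4));
        // Fewer readers than shards: each reader handles several shards.
        System.out.println(assignShards(shards, 2));
    }
}
```

With parallelism equal to the number of shards, every reader owns a single shard. With lower parallelism, shards are shared among readers, which still avoids the single-task limit described above.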
References:
- https://vitess.io/docs/concepts/vstream/
- https://cwiki.apache.org/confluence/display/FLINK/FLIP-27