[
https://issues.apache.org/jira/browse/FLINK-39169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Luca Occhipinti updated FLINK-39169:
------------------------------------
Description:
When running MySQL CDC in snapshot or initial mode (both streaming and batch)
in cloud environments like AWS Aurora/RDS, the connector must connect to the
primary/writer database instance to retrieve the binlog position, and it then
runs the snapshot queries on that same instance.
This creates unnecessary load on the primary/writer instance when performing
large snapshot reads, which can impact production workloads.
Usually there are read replicas specifically designed to offload read
traffic.
However, the current implementation cannot leverage these replicas for snapshot
data reading.
The proposal is to use the writer instance to get the binlog position, use the
reader replica to run the snapshot queries, and, if running in streaming mode,
keep using the writer to track binlog changes.
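The routing described in the proposal could be sketched roughly as follows.
This is only an illustration under stated assumptions, not the connector's
actual code: the class EndpointRouting, the Phase enum, the Endpoints holder,
and the host names are all hypothetical and do not correspond to existing
Flink CDC classes or options. The sketch also does not address how snapshot
reads on the replica are aligned with the binlog position recorded on the
writer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: which MySQL endpoint would serve each phase of a CDC job. */
public class EndpointRouting {

    /** Phases of a snapshot-then-stream job, as described in the proposal. */
    enum Phase { READ_BINLOG_POSITION, SNAPSHOT_QUERY, STREAM_BINLOG }

    /** Hypothetical endpoint pair; readerHost may be null if no replica is configured. */
    static final class Endpoints {
        final String writerHost;
        final String readerHost;

        Endpoints(String writerHost, String readerHost) {
            this.writerHost = writerHost;
            this.readerHost = readerHost;
        }
    }

    /** Only snapshot queries go to the reader; everything else stays on the writer. */
    static String hostFor(Phase phase, Endpoints endpoints) {
        if (phase == Phase.SNAPSHOT_QUERY && endpoints.readerHost != null) {
            return endpoints.readerHost;
        }
        // Fallback: without a reader, behave like the current writer-only setup.
        return endpoints.writerHost;
    }

    /** Lists the phases a job would run, each paired with the host chosen for it. */
    static List<String> plan(Endpoints endpoints, boolean streaming) {
        List<String> steps = new ArrayList<>();
        steps.add(Phase.READ_BINLOG_POSITION + " -> "
                + hostFor(Phase.READ_BINLOG_POSITION, endpoints));
        steps.add(Phase.SNAPSHOT_QUERY + " -> "
                + hostFor(Phase.SNAPSHOT_QUERY, endpoints));
        if (streaming) {
            // Binlog tracking continues on the writer in streaming mode.
            steps.add(Phase.STREAM_BINLOG + " -> "
                    + hostFor(Phase.STREAM_BINLOG, endpoints));
        }
        return steps;
    }

    public static void main(String[] args) {
        // Host names are placeholders, not real endpoints.
        Endpoints endpoints =
                new Endpoints("writer.example.internal", "reader.example.internal");
        plan(endpoints, true).forEach(System.out::println);
    }
}
```

Falling back to the writer when no reader is configured keeps the existing
behavior unchanged for deployments that have no replica.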
was:
When running MySQL CDC in snapshot or initial mode (both streaming and batch)
in cloud environments like AWS Aurora/RDS, the connector must connect to the
primary database instance to retrieve the binlog position, and it then
runs the snapshot queries on that same instance.
This creates unnecessary load on the primary/writer instance when performing
large snapshot reads, which can impact production workloads.
Usually there are read replicas specifically designed to offload read
traffic.
However, the current implementation cannot leverage these replicas for snapshot
data reading.
The proposal is to use the writer instance to get the binlog position, use the
reader replica to run the snapshot queries, and, if running in streaming mode,
keep using the writer to track binlog changes.
> [mysql-connector] Use reader instances to run snapshots
> -------------------------------------------------------
>
> Key: FLINK-39169
> URL: https://issues.apache.org/jira/browse/FLINK-39169
> Project: Flink
> Issue Type: Improvement
> Components: Flink CDC
> Reporter: Luca Occhipinti
> Priority: Major
>
> When running MySQL CDC in snapshot or initial mode (both streaming and batch)
> in cloud environments like AWS Aurora/RDS, the connector must connect to the
> primary/writer database instance to retrieve the binlog position, and it then
> runs the snapshot queries on that same instance.
> This creates unnecessary load on the primary/writer instance when performing
> large snapshot reads, which can impact production workloads.
> Usually there are read replicas specifically designed to offload read
> traffic.
> However, the current implementation cannot leverage these replicas for
> snapshot data reading.
> The proposal is to use the writer instance to get the binlog position, use the
> reader replica to run the snapshot queries, and, if running in streaming mode,
> keep using the writer to track binlog changes.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)