[ 
https://issues.apache.org/jira/browse/FLINK-39169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luca Occhipinti updated FLINK-39169:
------------------------------------
    Description: 
When running MySQL CDC in snapshot or initial mode (both streaming and batch) 
in cloud environments like AWS Aurora/RDS, the connector must connect to the 
primary/writer database instance to retrieve the binlog position, and it then 
continues running the snapshot queries on that same instance. 

This creates unnecessary load on the primary/writer instance when performing 
large snapshot reads, which can impact production workloads.

Usually there are read replicas specifically designed to offload read 
traffic.
However, the current implementation cannot leverage these replicas for snapshot 
data reading.

The proposal is to use the writer instance to get the binlog position, use the 
reader replica to run the snapshot queries, and, if running in streaming mode, 
keep using the writer to track binlog changes.
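
As a rough illustration of the proposed split, the sketch below maps each 
phase of a CDC job to the endpoint that would serve it. Everything in it is 
hypothetical: the class, the enum, the method, and the host names are not part 
of Flink CDC, and falling back to the writer when no reader is configured is 
an assumption, not something the issue specifies.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical sketch only; none of these names exist in Flink CDC. */
public class SnapshotRoutingSketch {

    /** Phases of a MySQL CDC job as described in this issue. */
    enum Phase { READ_BINLOG_POSITION, SNAPSHOT_QUERY, BINLOG_STREAMING }

    /**
     * Builds a phase-to-host plan: the writer serves the binlog position and
     * (in streaming mode) the binlog stream, the reader serves snapshot reads.
     */
    static Map<Phase, String> route(String writerHost, String readerHost, boolean streaming) {
        Map<Phase, String> plan = new EnumMap<>(Phase.class);
        plan.put(Phase.READ_BINLOG_POSITION, writerHost);
        // Assumption: with no reader configured, keep today's behaviour
        // and run the snapshot queries on the writer.
        boolean hasReader = readerHost != null && !readerHost.isEmpty();
        plan.put(Phase.SNAPSHOT_QUERY, hasReader ? readerHost : writerHost);
        if (streaming) {
            plan.put(Phase.BINLOG_STREAMING, writerHost);
        }
        return plan;
    }

    public static void main(String[] args) {
        // Placeholder host names, not real endpoints.
        Map<Phase, String> plan =
                route("writer.example.internal", "reader.example.internal", true);
        plan.forEach((phase, host) -> System.out.println(phase + " -> " + host));
    }
}
```

In batch mode the plan would simply omit the streaming phase, so the writer 
is contacted only once, for the binlog position.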

  was:
When running MySQL CDC in snapshot or initial mode (both streaming and batch) 
in cloud environments like AWS Aurora/RDS, the connector must connect to the 
primary database instance to retrieve the binlog position, and it then 
continues running the snapshot queries on that same instance. 

This creates unnecessary load on the primary/writer instance when performing 
large snapshot reads, which can impact production workloads.

Usually there are read replicas specifically designed to offload read 
traffic.
However, the current implementation cannot leverage these replicas for snapshot 
data reading.

The proposal is to use the writer instance to get the binlog position, use the 
reader replica to run the snapshot queries, and, if running in streaming mode, 
keep using the writer to track binlog changes.


> [mysql-connector] Use reader instances to run snapshots
> -------------------------------------------------------
>
>                 Key: FLINK-39169
>                 URL: https://issues.apache.org/jira/browse/FLINK-39169
>             Project: Flink
>          Issue Type: Improvement
>          Components: Flink CDC
>            Reporter: Luca Occhipinti
>            Priority: Major
>
> When running MySQL CDC in snapshot or initial mode (both streaming and batch) 
> in cloud environments like AWS Aurora/RDS, the connector must connect to the 
> primary/writer database instance to retrieve the binlog position, and it then 
> continues running the snapshot queries on that same instance. 
> This creates unnecessary load on the primary/writer instance when performing 
> large snapshot reads, which can impact production workloads.
> Usually there are read replicas specifically designed to offload read 
> traffic.
> However, the current implementation cannot leverage these replicas for 
> snapshot data reading.
> The proposal is to use the writer instance to get the binlog position, use 
> the reader replica to run the snapshot queries, and, if running in streaming 
> mode, keep using the writer to track binlog changes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
