Hi Cong,

Thanks for driving this discussion.

> IMO, the sink logic is better maintained in one place, in Paimon itself.

Do you have any suggestions? Maybe we could contribute the Flink CDC Paimon Sink to the Paimon community? Or do you have other ideas?

Best,
Jingsong

On Thu, Feb 13, 2025 at 3:17 PM Kevin Cheng <[email protected]> wrote:
>
> Hi devs,
>
> Currently, the Paimon sink connector supports only a single table, and the
> schema stays unchanged during insertion. That is fine for the Flink SQL API;
> the problems occur in CDC data synchronization.
>
> In CDC synchronization, both the data and the schema are supposed to be
> synchronized into the downstream Paimon tables. Taking MySQL as the upstream
> source, MySqlSyncTableAction
> <https://paimon.apache.org/docs/1.0/api/java/org/apache/paimon/flink/action/cdc/mysql/MySqlSyncTableAction>
> and MySqlSyncDatabaseAction
> <https://paimon.apache.org/docs/1.0/api/java/org/apache/paimon/flink/action/cdc/mysql/MySqlSyncDatabaseAction>
> are provided for such cases. However, two challenges had to be solved in
> developing these two actions:
> 1. Schema evolution is not considered in the original Paimon sink connector;
> 2. Multi-table sinks are not supported.
>
> To address these issues, the sink logic has been rewritten in a separate
> cdc module:
> 1. A retryable writer is introduced to stay synchronized with the newest
> schema changes, so the original Paimon writer is not reused;
> 2. A new multi-table sink had to be introduced so that the job does not
> need to be restarted when a new source table needs to be synchronized
> (but only fixed bucket mode is supported).
>
> As a result, two sets of sink logic are maintained in the Paimon community.
>
> Things did not get better when the Paimon sink was introduced in the Flink
> CDC community. There, the Paimon sink logic was rewritten again, based on
> the Flink Sink API.
>
> In total, there are three relatively separate code paths that maintain the
> Paimon sink logic: two in the Paimon community and one in the Flink CDC
> community.
>
> In fact, we already found some issues during the rewrite in the Flink CDC
> community, such as the 'commit user' usage differing from that in the
> Paimon community.
>
> IMO, the sink logic is better maintained in one place, in Paimon itself.
> Since there is still a lot of work to do to achieve that, I would like
> to initiate a discussion among the Paimon devs. Any thoughts are warmly
> welcome.
>
> Best,
> Cong Cheng
