Hi Jingsong,

Thanks for your quick reply.

As far as I'm concerned, from the Paimon community's perspective:

   1. It's better to reuse the same logic for both the SQL and DataStream
   connector APIs;
   2. the DataStream API should support multiple-table insertion;
   3. the DataStream API should take schema evolution into account.

The first point is what all connectors normally do, and it is what the
current Paimon DataStream API does (only single-table insertion is
supported). The second and third points are basically requirements from the
Flink CDC framework (no need to restart the job when a new table needs to
be synchronized to Paimon); a rough sketch of what they imply for the
sink's input type follows below.
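To make points 2 and 3 concrete, here is a minimal sketch in Java. All of
the names are made up for illustration; this is not any existing Paimon or
Flink CDC API, just my assumption of the shape such a record type could
take:

import java.io.Serializable;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical change event that carries its own routing and schema
 * information, so a single sink can write to many tables (point 2) and
 * evolve a table's schema without restarting the job (point 3).
 */
interface MultiTableChangeEvent extends Serializable {

    /** Fully qualified target table, used to route each record at runtime. */
    String databaseName();

    String tableName();

    /** Column name -> value of the row change itself. */
    Map<String, String> fields();

    /**
     * Schema changes (e.g. "add column c INT") that travel with the data,
     * so the sink can apply them to the Paimon table before writing,
     * instead of failing on unknown columns.
     */
    List<String> schemaChangeDescriptions();
}

The key idea is that routing and schema information travel with each
record, so the sink topology does not need to be rebuilt when a new table
appears.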

Maybe we can contribute the Flink CDC Paimon Sink to the Paimon community?


Since currently all requests for a multiple-table sink DataStream API come
from the Flink CDC framework, I basically agree with that.

But if we treat Flink CDC as just one user of the DataStream API (with its
own particular requirements), there is another option in which we do not
need to involve any Flink CDC APIs at all.

However, there is another problem that needs to be considered:

Even if we move the Flink CDC Paimon Sink to the Paimon community, will it
reuse the current Paimon sink code, or will we just add another module
named `flink-cdc`?

If we would like to reuse the same Paimon sink code, wow, that's not easy
work; all operators and topologies would have to be rewritten. I've tried
on my own, but sadly, I'm not a Paimon expert and the work would need to be
a collaboration.

Best,
Cong Cheng
