Hi everyone,

I'd like to propose FLIP-564: Support *FROM_CHANGELOG* and *TO_CHANGELOG*
built-in PTFs [1] for discussion.

Flink's DataStream API offers flexible methods like toChangelogStream() and
fromChangelogStream() to work with changelog streams, but SQL users
currently lack this capability. This FLIP introduces two built-in Process
Table Functions (PTFs) to bring *similar* functionality with additional
features to Flink SQL:


   - *FROM_CHANGELOG*: Converts an append-only stream of CDC records into a
   dynamic table, enabling, for example, custom CDC connector implementations
   directly in SQL.
   - *TO_CHANGELOG*: Converts a dynamic table back into an append-only
   changelog stream - the first operator that makes it possible to convert a
   retract/upsert stream back to append in SQL.


Both PTFs support flexible operation mapping (e.g., Debezium-style 'c',
'u', 'd' codes), before/after image handling, configurable state TTL, and
watermark-based ordering. They are designed to work symmetrically, so
FROM_CHANGELOG(TO_CHANGELOG(table)) round-trips correctly.

The naming follows the existing DataStream API convention. Alternative
names were considered (e.g., ENCODE_CHANGELOG, DEMATERIALIZE) - they are
under the rejected alternatives. If you have any better suggestions, feel
free to share them here.

Looking forward to your feedback and thoughts.

Kind regards,
Gustavo de Morais

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-564%3A+Support+FROM_CHANGELOG+and+TO_CHANGELOG+built-in+PTFs

Reply via email to