[ 
https://issues.apache.org/jira/browse/FLINK-23426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383826#comment-17383826
 ] 

zoucao commented on FLINK-23426:
--------------------------------

Hi [~jark], [~twalthr], should we also support this materialize operator in 
streaming mode? At present, if the source is not insert-only (e.g. mysql-cdc), 
some scenarios cannot be supported, for example the window aggregation 
mentioned in https://issues.apache.org/jira/browse/FLINK-20281. If the 
materialize operator could be invoked by a specific SQL statement, we could 
work around that limitation for now. I think it would be great if Flink 
provided a way to materialize a changelog stream into an insert-only stream in 
both batch and streaming mode.
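
To make "materialize" concrete, here is a minimal sketch in plain Python. It is 
not Flink code and not a proposed operator implementation. It assumes each 
change is a hypothetical (kind, key, value) tuple, that rows are identified by a 
single key, and that the kinds use the short strings of Flink's RowKind 
(+I, -U, +U, -D). The function name materialize is made up for illustration.

```python
from typing import Dict, Iterable, List, Tuple

# Change kinds, using the short strings of Flink's RowKind.
INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE = "+I", "-U", "+U", "-D"

# Hypothetical change record: (kind, key, value).
Change = Tuple[str, str, int]


def materialize(changelog: Iterable[Change]) -> List[Tuple[str, int]]:
    """Fold a bounded changelog into the final insert-only rows, sorted by key."""
    state: Dict[str, int] = {}
    for kind, key, value in changelog:
        if kind in (INSERT, UPDATE_AFTER):
            # New or updated image of the row: keep the latest value.
            state[key] = value
        elif kind in (UPDATE_BEFORE, DELETE):
            # Retraction of the previous image: drop the row.
            state.pop(key, None)
        else:
            raise ValueError(f"unknown change kind: {kind!r}")
    return sorted(state.items())


changelog = [
    (INSERT, "a", 1),
    (INSERT, "b", 2),
    (UPDATE_BEFORE, "a", 1),
    (UPDATE_AFTER, "a", 10),
    (DELETE, "b", 2),
]
rows = materialize(changelog)
print(rows)
```

In this sketch, dropping the (+U, "a", 10) change would make row "a" disappear 
from the result. That is the kind of problem the issue description refers to 
when it requires "complete" CDC logs.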

> Support changelog processing in batch mode
> ------------------------------------------
>
>                 Key: FLINK-23426
>                 URL: https://issues.apache.org/jira/browse/FLINK-23426
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table SQL / API
>            Reporter: Timo Walther
>            Priority: Major
>
> The DataStream API can execute arbitrary DataStream programs when running in 
> batch mode. However, this is not the case for the Table API batch mode. E.g. 
> a source with non-insert-only changes is not supported and updates/deletes 
> cannot be emitted.
> In theory, we could make this work by running the "stream mode" of the 
> planner (CDC transformations) on top of the "batch mode" of DataStream API 
> (specialized state backend, sorted inputs). It is up for discussion if and 
> how we expose such functionality.
> If we don't allow enabling incremental updates, we can also add a special 
> batch operator that materializes the incoming changes for a batch pipeline. 
> However, it would require "complete" CDC logs (i.e. no missing UPDATE_AFTER).



