[ https://issues.apache.org/jira/browse/FLINK-18825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Flink Jira Bot updated FLINK-18825:
-----------------------------------
Labels: stale-assigned (was: )
> Support to process CDC message in batch mode
> --------------------------------------------
>
> Key: FLINK-18825
> URL: https://issues.apache.org/jira/browse/FLINK-18825
> Project: Flink
> Issue Type: Sub-task
> Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table
> SQL / Ecosystem
> Reporter: Jark Wu
> Assignee: Nicholas Jiang
> Priority: Major
> Labels: stale-assigned
>
> Currently, processing CDC data is only supported in streaming mode. However,
> there are also cases for processing CDC data in batch mode, for example to
> load, materialize and compress CDC data into Hive using the Parquet format.
> Interpreting a changelog in batch mode can be supported by adding a
> MaterializeOperator after the CDC source operator. The MaterializeOperator
> acts like a HashAggregate or a SortAggregate, depending on whether all the
> fields are fixed-length. It consumes all the input RowData records and
> replaces or removes rows depending on the flag in the header.
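
To illustrate the materialization step described above, here is a minimal
Python sketch of the hash-based variant. It is not Flink code and not the
proposed MaterializeOperator. The function name `materialize`, the
`key_index` parameter, and the use of plain tuples as rows are all
assumptions made for illustration. Only the row kind strings (+I, -U, +U,
-D) follow the short strings of Flink's RowKind.

```python
from typing import Dict, Iterable, List, Tuple

# Row kinds, written as the short strings used by Flink's RowKind.
INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE = "+I", "-U", "+U", "-D"

# Assumption: a row is a plain tuple and one field acts as the primary key.
Row = Tuple


def materialize(changelog: Iterable[Tuple[str, Row]], key_index: int = 0) -> List[Row]:
    """Fold a bounded changelog into the final set of rows (hypothetical helper)."""
    state: Dict[object, Row] = {}
    for kind, row in changelog:
        key = row[key_index]
        if kind in (INSERT, UPDATE_AFTER):
            # Insert a new row or replace the previous version of this key.
            state[key] = row
        elif kind in (UPDATE_BEFORE, DELETE):
            # Retract the old version; a missing key is ignored.
            state.pop(key, None)
        else:
            raise ValueError(f"unknown row kind: {kind!r}")
    return list(state.values())


changelog = [
    ("+I", (1, "a")),
    ("+I", (2, "b")),
    ("-U", (1, "a")),
    ("+U", (1, "c")),
    ("-D", (2, "b")),
]
result = materialize(changelog)
```

After the whole bounded input is consumed, only the latest version of each
surviving key remains, which is what a batch job would then write out (for
example to Hive). A sort-based variant would instead order the records by
key and collapse each group, trading the in-memory hash table for a sort.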
--
This message was sent by Atlassian Jira
(v8.3.4#803005)