[jira] [Commented] (FLINK-25205) Optimize SinkUpsertMaterializer

Jingsong Lee (Jira) Tue, 07 Dec 2021 18:25:04 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-25205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454930#comment-17454930
 ]


Jingsong Lee commented on FLINK-25205:
--------------------------------------

[~twalthr] Did you mean UPDATE_AFTER? We can leave it alone, but for the 
SinkUpsertMaterializer node it is effectively a new "INSERT" record.

Actually, our storage (Kafka, for example) cannot separate the before and after 
records of an update, so at the moment there is no case where before and after 
would appear in different checkpoints.
Currently, only an insert and a delete can end up in different checkpoints.

> Optimize SinkUpsertMaterializer
> -------------------------------
>
>                 Key: FLINK-25205
>                 URL: https://issues.apache.org/jira/browse/FLINK-25205
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Runtime
>            Reporter: Jingsong Lee
>            Priority: Major
>
> SinkUpsertMaterializer maintains incoming records in state corresponding to 
> the upsert keys and generates an upsert view for the downstream operator.
> It is intended to solve the out-of-order problem caused by the upstream 
> computation, but it stores the data in state, which keeps growing over 
> time.
> If we can assume that the disorder only occurs within a checkpoint, we can 
> consider cleaning up the state at each checkpoint, which would bound the size 
> of the state.
> We can consider adding a config option for this optimization first.
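
The behaviour described in the quoted issue can be sketched roughly as follows. 
This is a toy in-memory model written in Python for illustration only, not 
Flink's actual operator: the class name UpsertMaterializerSketch, the methods 
process/on_checkpoint/state_size, and the policy of keeping only the latest row 
per key at a checkpoint are all assumptions made for this example.

```python
class UpsertMaterializerSketch:
    """Toy model: keeps the rows seen per upsert key and emits an upsert view.

    Hypothetical illustration only; names and cleanup policy are assumptions.
    """

    def __init__(self):
        # upsert key -> list of rows seen for that key, oldest first
        self.state = {}

    def process(self, kind, key, row):
        """Handle one changelog record; return the records emitted downstream."""
        if kind in ("INSERT", "UPDATE_AFTER"):
            # For this node an UPDATE_AFTER is treated like a new insert.
            rows = self.state.setdefault(key, [])
            out_kind = "UPDATE_AFTER" if rows else "INSERT"
            rows.append(row)
            return [(out_kind, key, row)]

        # DELETE / UPDATE_BEFORE: retract the most recent matching row.
        rows = self.state.get(key, [])
        for i in range(len(rows) - 1, -1, -1):
            if rows[i] == row:
                was_latest = i == len(rows) - 1
                del rows[i]
                if not rows:
                    del self.state[key]
                    return [("DELETE", key, row)]
                if was_latest:
                    # Fall back to the previous row for this key.
                    return [("UPDATE_AFTER", key, rows[-1])]
                return []
        # Assumption: a retraction with no matching row is ignored.
        return []

    def on_checkpoint(self):
        """Assumed cleanup policy: keep only the latest row per key."""
        for key in list(self.state):
            self.state[key] = self.state[key][-1:]

    def state_size(self):
        return sum(len(rows) for rows in self.state.values())


m = UpsertMaterializerSketch()
m.process("INSERT", "k", "a")
m.process("UPDATE_AFTER", "k", "b")
print(m.state_size())  # two rows retained for key "k" before cleanup
m.on_checkpoint()
print(m.state_size())  # one row retained after the assumed cleanup
```

With this cleanup the state holds at most one row per key after each 
checkpoint, at the cost of no longer matching a retraction for a row that was 
received before the checkpoint and has since been dropped.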



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
