[
https://issues.apache.org/jira/browse/FLINK-25205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454472#comment-17454472
]
Lsw_aka_laplace edited comment on FLINK-25205 at 12/7/21, 9:19 AM:
-------------------------------------------------------------------
UPDATE
After discussing with [~lzljs3620320], it turns out my question does not
apply in this situation. Please ignore it.
Thanks, [~lzljs3620320], for your patience.
------
Hi [~lzljs3620320], after looking through `SinkUpsertMaterializer`, I have
one question about this issue. Is all data from one changelog stream naturally
split by checkpoints? Suppose an UPDATE_BEFORE row and an UPDATE_AFTER row
happen to be separated by checkpoint T, so that, as far as I understand, the
UPDATE_BEFORE row belongs to checkpoint T but the UPDATE_AFTER row belongs to
checkpoint T+1. What should we do in this situation?
If not, would you mind explaining why?
Cheers~
was (Author: neighborhood):
Hi [~lzljs3620320], after looking through `SinkUpsertMaterializer`, I have
one question about this issue. Is all data from one changelog stream naturally
split by checkpoints? Suppose an UPDATE_BEFORE row and an UPDATE_AFTER row
happen to be separated by checkpoint T, so that, as far as I understand, the
UPDATE_BEFORE row belongs to checkpoint T but the UPDATE_AFTER row belongs to
checkpoint T+1. What should we do in this situation?
If not, would you mind explaining why?
Cheers~
> Optimize SinkUpsertMaterializer
> -------------------------------
>
> Key: FLINK-25205
> URL: https://issues.apache.org/jira/browse/FLINK-25205
> Project: Flink
> Issue Type: Improvement
> Components: Table SQL / Runtime
> Reporter: Jingsong Lee
> Priority: Major
>
> SinkUpsertMaterializer maintains incoming records in state, keyed by the
> upsert keys, and generates an upsert view for the downstream operator.
> It is intended to solve the out-of-order problem caused by upstream
> computation, but it stores the data in state, which keeps growing.
> If we can assume that disorder only occurs within a checkpoint, we can
> consider cleaning up the state at each checkpoint, which would keep the size
> of the state under control.
> We can consider adding a config option for this optimization first.
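
For readers unfamiliar with the operator, here is a minimal, Flink-free sketch of the idea described in the issue: keep the live rows per upsert key and emit an upsert view, even when an UPDATE_AFTER arrives before the matching UPDATE_BEFORE. It is a simplified illustration, not the actual `SinkUpsertMaterializer` code. The class name, the string-based rows, and the `+I`/`+U`/`-D` output format are assumptions made for this example. The per-key lists are the state that the issue proposes to clean up at checkpoints; that cleanup is not implemented here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified sketch of per-key upsert materialization (no Flink APIs). */
class UpsertMaterializerSketch {

    enum Kind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

    // Stands in for keyed state: upsert key -> rows currently alive for that key.
    private final Map<String, List<String>> state = new HashMap<>();

    /** Processes one changelog record; returns the emitted upsert record, or null if none. */
    String process(Kind kind, String key, String row) {
        List<String> rows = state.computeIfAbsent(key, k -> new ArrayList<>());
        if (kind == Kind.INSERT || kind == Kind.UPDATE_AFTER) {
            boolean first = rows.isEmpty();
            rows.add(row);
            return (first ? "+I " : "+U ") + key + "=" + row;
        }
        // DELETE or UPDATE_BEFORE: retract the most recent matching row.
        int idx = rows.lastIndexOf(row);
        if (idx < 0) {
            if (rows.isEmpty()) {
                state.remove(key);
            }
            return null; // nothing to retract for this key
        }
        boolean wasLatest = idx == rows.size() - 1;
        rows.remove(idx);
        if (rows.isEmpty()) {
            state.remove(key);
            return "-D " + key + "=" + row;
        }
        // Re-emit only if the visible (latest) row for the key changed.
        return wasLatest ? "+U " + key + "=" + rows.get(rows.size() - 1) : null;
    }

    /** Total number of rows held in state across all keys. */
    int stateSize() {
        return state.values().stream().mapToInt(List::size).sum();
    }
}
```

In this sketch, if the UPDATE_AFTER for a key arrives before the UPDATE_BEFORE of the old row, the late retraction removes the old row from the list without emitting anything, so the downstream view keeps the new value.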