[
https://issues.apache.org/jira/browse/FLINK-29225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
lincoln lee updated FLINK-29225:
--------------------------------
Description:
Currently, if the interval between the arrival of an insert/update message and
the corresponding delete message exceeds the state TTL, the delete message is
incorrectly ignored in `SinkUpsertMaterializer`. This causes a wrong result in
the corresponding sink table (dirty data is left behind).
1. If the state TTL is set to '10 hour', the following delete message will be
ignored (the '+I (1, a1)' row will be left in the sink table forever):
{code:java}
00:00:01 +I (1, a1)
10:00:02 -D (1, a1)
{code}
2. In contrast, if we instead sent the delete message downstream whenever the
state has expired, the following case would wrongly delete data from the sink
table (the '+I (1, a2)' row would be removed):
{code:java}
00:00:01 +I (1, a1)
00:00:02 +I (1, a2)
10:00:03 -D (1, a1)
{code}
Comparing the two choices, the current implementation and eager deletion: the
former leaves dirty data behind, while the latter loses some data (the former
seems less harmful).
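To make the trade-off concrete, here is a minimal, self-contained sketch that
replays both cases under the two policies. It is a toy model, not the actual
`SinkUpsertMaterializer` code: the class `TtlDeleteSketch`, the `Policy` enum,
and the rule that expiry is measured from the last state access are all
assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of an upsert materializer with state TTL (hypothetical, not Flink code). */
public class TtlDeleteSketch {

    /** What to do with a delete message whose key has no live state. */
    public enum Policy { IGNORE, EAGER_DELETE }

    /** Per-key state: rows seen so far and the time of the last access. */
    private static final class KeyState {
        final List<String> rows = new ArrayList<>();
        long lastAccess;
    }

    private final long ttlSeconds;
    private final Policy policy;
    private final Map<Integer, KeyState> state = new HashMap<>();
    /** Simulated upsert sink table: at most one row per key. */
    private final Map<Integer, String> sink = new HashMap<>();

    TtlDeleteSketch(long ttlSeconds, Policy policy) {
        this.ttlSeconds = ttlSeconds;
        this.policy = policy;
    }

    /** Returns the state for the key, dropping it first if it has expired. */
    private KeyState liveState(int key, long now) {
        KeyState s = state.get(key);
        if (s != null && now - s.lastAccess > ttlSeconds) {
            state.remove(key);
            return null;
        }
        return s;
    }

    void insert(long now, int key, String value) {
        KeyState s = liveState(key, now);
        if (s == null) {
            s = new KeyState();
            state.put(key, s);
        }
        s.rows.add(value);
        s.lastAccess = now;
        sink.put(key, value); // the latest row wins in the sink
    }

    void delete(long now, int key, String value) {
        KeyState s = liveState(key, now);
        if (s == null) {
            // State expired: either drop the message or delete by key blindly.
            if (policy == Policy.EAGER_DELETE) {
                sink.remove(key);
            }
            return;
        }
        s.lastAccess = now;
        s.rows.remove(value);
        if (s.rows.isEmpty()) {
            state.remove(key);
            sink.remove(key);
        } else {
            sink.put(key, s.rows.get(s.rows.size() - 1));
        }
    }

    private static long hms(int h, int m, int s) {
        return h * 3600L + m * 60L + s;
    }

    /** Case 1: 00:00:01 +I (1, a1), 10:00:02 -D (1, a1), TTL of 10 hours. */
    public static Map<Integer, String> runCase1(Policy policy) {
        TtlDeleteSketch m = new TtlDeleteSketch(hms(10, 0, 0), policy);
        m.insert(hms(0, 0, 1), 1, "a1");
        m.delete(hms(10, 0, 2), 1, "a1");
        return m.sink;
    }

    /** Case 2: 00:00:01 +I (1, a1), 00:00:02 +I (1, a2), 10:00:03 -D (1, a1). */
    public static Map<Integer, String> runCase2(Policy policy) {
        TtlDeleteSketch m = new TtlDeleteSketch(hms(10, 0, 0), policy);
        m.insert(hms(0, 0, 1), 1, "a1");
        m.insert(hms(0, 0, 2), 1, "a2");
        m.delete(hms(10, 0, 3), 1, "a1");
        return m.sink;
    }

    public static void main(String[] args) {
        for (Policy p : Policy.values()) {
            System.out.println(p + " case 1: " + runCase1(p));
            System.out.println(p + " case 2: " + runCase2(p));
        }
    }
}
```

In this model, `IGNORE` leaves (1, a1) in the sink for case 1 (dirty data) but
keeps (1, a2) for case 2, while `EAGER_DELETE` cleans up case 1 but empties the
sink in case 2 (data loss).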
> Delete message incorrectly ignored in SinkUpsertMaterializer
> ------------------------------------------------------------
>
> Key: FLINK-29225
> URL: https://issues.apache.org/jira/browse/FLINK-29225
> Project: Flink
> Issue Type: Bug
> Affects Versions: 1.14.5, 1.15.2
> Reporter: lincoln lee
> Priority: Major
>