[
https://issues.apache.org/jira/browse/FLINK-21839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17319985#comment-17319985
]
Yuan Mei commented on FLINK-21839:
----------------------------------
Hey [~lintingbin], with the info provided in the ticket, I cannot tell whether
it is a bug or not.
I think that's the expected behavior (not saying it is the right behavior, but
expected). Before diving into the details, would you mind sharing a bit more
info about your pipeline? Is it using the old source or the new source
(FLIP-27)? What does your application look like? Are there any suspicious logs?
The reason why `TaskSink` `invoke()` is still called after the snapshot:
The `stop-with-drain` procedure contains two stages:
1. insert max_watermark and create the savepoint
2. stop the source to trigger the `END_OF_PARTITION` event, which stops the job.
In the gap between "draining window state and creating the savepoint" and
"handling END_OF_PARTITION", data can still flow in from the source.
?? SourceStreamTask#finishTask
/**
 * Currently stop with savepoint relies on the EndOfPartitionEvents propagation and performs
 * clean shutdown after the stop with savepoint (which can produce some records to process
 * after the savepoint while stopping). If we interrupt source thread, we might leave the
 * network stack in an inconsistent state. So, if we want to relay on the clean shutdown, we
 * can not interrupt the source thread.
 */??
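To make the gap concrete, here is a minimal plain-Java sketch that replays the call order from the log in the ticket. It does not use any Flink API: `BufferingSink`, `replay`, and `StopWithDrainGapSketch` are hypothetical names, and the record numbers (0 to 427 before the snapshot, 428 after) are an assumption that simply mirrors the log, treating the elided range as consecutive. It only illustrates that a sink which flushes solely in `snapshotState()` can still be holding a record when `close()` runs.

```java
import java.util.ArrayList;
import java.util.List;

public class StopWithDrainGapSketch {

    /** Hypothetical stand-in for a buffering sink; NOT the Flink SinkFunction interface. */
    static class BufferingSink {
        private final List<Integer> pending = new ArrayList<>();
        private final List<Integer> flushedAtSnapshot = new ArrayList<>();

        /** Buffers a record, as a sink that flushes only on snapshot would. */
        void invoke(int value) {
            pending.add(value);
        }

        /** Flushes everything buffered so far; models the savepoint stage. */
        void snapshotState() {
            flushedAtSnapshot.addAll(pending);
            pending.clear();
        }

        /** Returns whatever arrived after the last snapshot and was never flushed. */
        List<Integer> close() {
            List<Integer> leftover = new ArrayList<>(pending);
            pending.clear();
            return leftover;
        }

        List<Integer> flushedAtSnapshot() {
            return flushedAtSnapshot;
        }
    }

    /** Replays the order seen in the log: invoke 0..427, snapshotState, invoke 428, close. */
    static List<Integer> replay(BufferingSink sink) {
        for (int i = 0; i <= 427; i++) {
            sink.invoke(i);
        }
        sink.snapshotState(); // stage 1: max_watermark + savepoint
        sink.invoke(428);     // record emitted in the gap before END_OF_PARTITION
        return sink.close();  // stage 2: clean shutdown
    }

    public static void main(String[] args) {
        BufferingSink sink = new BufferingSink();
        List<Integer> leftover = replay(sink);
        System.out.println("flushed at snapshot: " + sink.flushedAtSnapshot().size());
        System.out.println("left over at close: " + leftover);
    }
}
```

In this sketch the record that arrives in the gap is only visible at `close()`. Whether a real sink should flush there depends on its delivery guarantees.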
You can refer to "FLINK-21133" and FLIP-147 for a bit more detail.
> SinkFunction snapshotState don't snapshot all data when trigger a
> stop-with-drain savepoint
> -------------------------------------------------------------------------------------------
>
> Key: FLINK-21839
> URL: https://issues.apache.org/jira/browse/FLINK-21839
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing
> Affects Versions: 1.12.2
> Reporter: Darcy Lin
> Assignee: Yuan Mei
> Priority: Critical
> Attachments: TestSink.java
>
>
> This problem was discovered while I was developing Flink code. My custom sink
> doesn't send all the data produced by an event-time window when a
> stop-with-drain savepoint is triggered.
> TestSink.java is an example in which SinkFunction invoke() continues to run after
> snapshotState() has executed when a stop-with-drain savepoint is triggered via the REST API.
> {code:java}
> //TaskSink.java log
> sink open
> invoke: 0
> invoke: 1
> invoke: 2
> invoke: 3
> invoke: 4
> invoke: 5
> invoke: 6
> invoke: 7
> invoke: 8
> invoke: 9
> ...
> invoke: 425
> invoke: 426
> invoke: 427
> snapshotState
> invoke: 428 // It should be executed before snapshotState.
> sink close{code}