[
https://issues.apache.org/jira/browse/FLINK-23527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stephan Ewen updated FLINK-23527:
---------------------------------
Fix Version/s: 1.13.3
> Clarify `SourceFunction#cancel()` contract about interrupting
> -------------------------------------------------------------
>
> Key: FLINK-23527
> URL: https://issues.apache.org/jira/browse/FLINK-23527
> Project: Flink
> Issue Type: Bug
> Components: API / DataStream, Documentation
> Affects Versions: 1.13.1
> Reporter: Piotr Nowojski
> Assignee: Stephan Ewen
> Priority: Critical
> Labels: pull-request-available
> Fix For: 1.14.0, 1.13.3
>
>
> We should clarify the contract of {{SourceFunction#cancel()}}
> # source itself shouldn’t be interrupting the source thread
> # interrupt shouldn’t be expected in the clean cancellation case
> Interrupting the code on the clean shutdown path can cause failures when
> doing `stop-with-savepoint`. If source thread is interrupted during
> backpressure, this leaves network stack in invalid state, making it
> impossible to send {{EndOfPartitionEvent}} to complete the shutdown.
> In a bit more detail, when source thread is backpressured, network stack
> might have already sent a partial record and it could be waiting for a buffer
> to finish writing/serialising that record. If network stack is interrupted
> while waiting for that buffer, it will never resume writing/serialisation of
> the remaining part of that record, while downstream node will be expecting
> those bytes. If in this situation we attempt to emit anything (like
> {{EndOfPartitionEvent}}), this will most likely cause deserialisation errors
> on the downstream nodes.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)