[
https://issues.apache.org/jira/browse/FLINK-19300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Metzger updated FLINK-19300:
-----------------------------------
Affects Version/s: 1.8.0
> Timer loss after restoring from savepoint
> -----------------------------------------
>
> Key: FLINK-19300
> URL: https://issues.apache.org/jira/browse/FLINK-19300
> Project: Flink
> Issue Type: Bug
> Components: Runtime / State Backends
> Affects Versions: 1.8.0
> Reporter: Xiang Gao
> Priority: Blocker
> Fix For: 1.12.0, 1.11.3
>
>
> While using heap-based timers, we are seeing occasional timer loss after
> restoring program from savepoint, especially when using a remote savepoint
> storage (s3).
> After some investigation, the issue seems to be related to [this line in
> deserialization|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/io/PostVersionedIOReadableWritable.java#L65].
> When trying to check the VERSIONED_IDENTIFIER, the input stream may not
> guarantee filling the byte array, causing timers to be dropped for the
> affected key group.
> Should keep reading until expected number of bytes are actually read or if
> end of the stream has been reached.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)