[ https://issues.apache.org/jira/browse/FLINK-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15839318#comment-15839318 ]
ASF GitHub Bot commented on FLINK-4616:
---------------------------------------
Github user tzulitai commented on a diff in the pull request:
https://github.com/apache/flink/pull/3031#discussion_r97757847
--- Diff: flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBase.java ---
@@ -101,7 +101,7 @@
      * The assigner is kept in serialized form, to deserialize it into multiple copies */
     private SerializedValue<AssignerWithPunctuatedWatermarks<T>> punctuatedWatermarkAssigner;
-    private transient ListState<Tuple2<KafkaTopicPartition, Long>> offsetsStateForCheckpoint;
+    private transient ListState<Tuple2<KafkaTopicPartition, Tuple2<Long, Long>>> offsetsAndWatermarksStateForCheckpoint;
--- End diff --
I think we should switch to a dedicated checkpointed state object
instead of continuing to "extend" the original Tuple. That would also make it
easier to stay compatible with any future changes to the checkpointed state.
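To illustrate the idea, here is a minimal sketch of what such a state object could look like. The class name `PartitionCheckpointState`, its fields, the `NO_WATERMARK` sentinel, and the `fromOffsetOnly` helper are all hypothetical and not part of the pull request; the sketch uses plain Java only and makes no claim about how Flink would serialize it.

```java
import java.io.Serializable;
import java.util.Objects;

/**
 * Hypothetical per-partition checkpoint state (name and fields are assumptions
 * for illustration). New fields can be added here later without nesting
 * further Tuple types.
 */
final class PartitionCheckpointState implements Serializable {
    private static final long serialVersionUID = 1L;

    /** Assumed sentinel meaning "no watermark has been emitted yet". */
    static final long NO_WATERMARK = Long.MIN_VALUE;

    private final long offset;
    private final long lastEmittedWatermark;

    PartitionCheckpointState(long offset, long lastEmittedWatermark) {
        this.offset = offset;
        this.lastEmittedWatermark = lastEmittedWatermark;
    }

    /** Builds state from an old-style entry that only carried an offset. */
    static PartitionCheckpointState fromOffsetOnly(long offset) {
        return new PartitionCheckpointState(offset, NO_WATERMARK);
    }

    long getOffset() {
        return offset;
    }

    long getLastEmittedWatermark() {
        return lastEmittedWatermark;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof PartitionCheckpointState)) {
            return false;
        }
        PartitionCheckpointState that = (PartitionCheckpointState) o;
        return offset == that.offset
                && lastEmittedWatermark == that.lastEmittedWatermark;
    }

    @Override
    public int hashCode() {
        return Objects.hash(offset, lastEmittedWatermark);
    }

    @Override
    public String toString() {
        return "PartitionCheckpointState{offset=" + offset
                + ", lastEmittedWatermark=" + lastEmittedWatermark + "}";
    }
}
```

With something along these lines, the state field could hold one such object per `KafkaTopicPartition` rather than a nested `Tuple2<Long, Long>`, and a restore path for old checkpoints could map offset-only entries through a helper like `fromOffsetOnly`.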
> Kafka consumer doesn't store last emitted watermarks per partition in state
> --------------------------------------------------------------------------
>
> Key: FLINK-4616
> URL: https://issues.apache.org/jira/browse/FLINK-4616
> Project: Flink
> Issue Type: Bug
> Components: Kafka Connector
> Affects Versions: 1.1.1
> Reporter: Yuri Makhno
> Assignee: Roman Maier
>
> The Kafka consumer stores only Kafka offsets in state and does not store the
> last emitted watermarks, which can lead to incorrect state when a checkpoint is
> restored. Say our watermark is (timestamp - 10). With the following message
> queue, results after a checkpoint restore will differ from results during
> normal processing:
> A(ts = 30)
> B(ts = 35)
> ------ checkpoint goes here
> C(ts=15) -- this one should be filtered by next time window
> D(ts=60)
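The scenario quoted above can be reproduced with a small stand-alone simulation. This is a simplified sketch, not Flink code: it assumes the watermark is the running maximum of (timestamp - 10), that an element counts as late when its timestamp is at or below the current watermark, and that a restore without a stored watermark starts again from `Long.MIN_VALUE`. The class and method names are hypothetical.

```java
/** Hypothetical simulation of the A/B/C/D example; not part of Flink. */
final class WatermarkRestoreDemo {
    /** The "10" from the watermark definition (timestamp - 10). */
    static final long MAX_OUT_OF_ORDERNESS = 10L;

    /**
     * Counts elements that arrive with a timestamp at or below the current
     * watermark, starting from the given initial watermark.
     */
    static int countLate(long[] timestamps, long initialWatermark) {
        long watermark = initialWatermark;
        int late = 0;
        for (long ts : timestamps) {
            if (ts <= watermark) {
                late++;
            }
            // The watermark never moves backwards.
            watermark = Math.max(watermark, ts - MAX_OUT_OF_ORDERNESS);
        }
        return late;
    }

    public static void main(String[] args) {
        // Normal processing: A, B, C, D without interruption.
        int normal = countLate(new long[] {30, 35, 15, 60}, Long.MIN_VALUE);
        // Restore after B with the watermark lost: only C and D are replayed.
        int restoredWithoutWatermark = countLate(new long[] {15, 60}, Long.MIN_VALUE);
        // Restore after B with the watermark (35 - 10 = 25) kept in state.
        int restoredWithWatermark = countLate(new long[] {15, 60}, 25L);

        System.out.println("late, normal processing:        " + normal);
        System.out.println("late, restore w/o watermark:    " + restoredWithoutWatermark);
        System.out.println("late, restore with watermark:   " + restoredWithWatermark);
    }
}
```

Under these assumptions C is late during normal processing and when the watermark is restored, but it is not late when the restore starts without a watermark, which is the divergence the issue describes.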