[
https://issues.apache.org/jira/browse/FLINK-17638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Flink Jira Bot updated FLINK-17638:
-----------------------------------
Labels: auto-deprioritized-major stale-minor (was:
auto-deprioritized-major)
I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help
the community manage its development. I see this issue has been marked as
Minor but is unassigned and neither itself nor its Sub-Tasks have been updated
for 180 days. I have gone ahead and marked it "stale-minor". If this ticket is
still Minor, please either assign yourself or give an update. Afterwards,
please remove the label or in 7 days the issue will be deprioritized.
> FlinkKafkaConsumerBase restore from empty state will be set consume from
> earliest forced
> ----------------------------------------------------------------------------------------
>
> Key: FLINK-17638
> URL: https://issues.apache.org/jira/browse/FLINK-17638
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Kafka
> Affects Versions: 1.9.0, 1.9.3, 1.10.0
> Environment: Flink 1.9.0
> kafka 1.1.0
> jdk 1.8
> Reporter: chenchuangchuang
> Priority: Minor
> Labels: auto-deprioritized-major, stale-minor
>
> My goal and data are as follows:
> # I need to count the number of posts each user created in the last 30 days in my system.
> # The full, real-time data is in MySQL.
> # I can get the incremental MySQL binlog from kafka-1.1.1 (it only stores the
> last 7 days of binlog); the topic name is "binlog_post_topic".
> # So I have to combine the MySQL data with the binlog data.
>
> I do it this way:
> # First, I write a snapshot of the MySQL data to a Kafka topic, ordered by
> create_time (the topic name is "init-post-topic"), and consume
> "init-post-topic" as a Flink data stream with SlidingEventTimeWindows.
> # Second, after the job has processed all the data in "init-post-topic", I
> create a savepoint for the job; call it save-point-a.
> # Third, I modify my code:
> ## the data source becomes the Kafka topic "binlog_post_topic",
> ## the other operators do not change,
> ## and "binlog_post_topic" is set to start consuming from a specific timestamp
> (the time the MySQL snapshot was created).
> # Fourth, I restart my job from save-point-a.
> But I find that the Kafka consumer for "binlog_post_topic" does not consume
> from the timestamp I set, but from the earliest offset. I found this log in the
> task manager:
> {code:java}
> 2020-05-11 17:20:47,228 INFO
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase -
> Consumer subtask 0 restored state: {}.
> ...
> 2020-05-12 20:14:52,641 INFO
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase -
> Consumer subtask 0 will start reading 1 partitions with offsets in restored
> state: {KafkaTopicPartition{topic='binlog_post_topic',
> partition=0}=-915623761775}
> 2020-05-11 17:20:47,414 INFO
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase -
> Consumer subtask 0 creating fetcher with offsets
> {KafkaTopicPartition{topic='binlog_post_topic', partition=0}=-915623761775}.
> {code}
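The value -915623761775 in the log above is not a real Kafka offset. It appears to be one of the marker values from KafkaTopicPartitionStateSentinel, and the snippet quoted further down confirms that EARLIEST_OFFSET is what gets written. A minimal sketch for decoding such values follows; the constants are reproduced from memory of the Flink source and should be checked against the version in use, and the class and method names are hypothetical.

```java
// Hypothetical helper for reading sentinel offsets in Flink Kafka consumer logs.
// The constant values are an assumption recalled from
// KafkaTopicPartitionStateSentinel; verify them against your Flink version.
class SentinelDecoder {
    static final long OFFSET_NOT_SET = -915623761776L;
    static final long EARLIEST_OFFSET = -915623761775L;
    static final long LATEST_OFFSET = -915623761774L;
    static final long GROUP_OFFSET = -915623761773L;

    // Maps a logged offset to a sentinel name, or reports it as a plain offset.
    static String describe(long offset) {
        if (offset == OFFSET_NOT_SET) {
            return "OFFSET_NOT_SET";
        } else if (offset == EARLIEST_OFFSET) {
            return "EARLIEST_OFFSET";
        } else if (offset == LATEST_OFFSET) {
            return "LATEST_OFFSET";
        } else if (offset == GROUP_OFFSET) {
            return "GROUP_OFFSET";
        }
        return "real offset " + offset;
    }

    public static void main(String[] args) {
        // Offset taken from the task manager log quoted above.
        System.out.println(describe(-915623761775L));
    }
}
```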
> I guessed this might be caused by FlinkKafkaConsumerBase,
> and then I found code like this
> in the method FlinkKafkaConsumerBase.initializeState():
> {code:java}
> if (context.isRestored() && !restoredFromOldState) {
>     restoredState = new TreeMap<>(new KafkaTopicPartition.Comparator());
>     ....
> {code}
> This code means that if a job is restarted from a savepoint, restoredState
> will not be null; it will be at least an empty TreeMap.
> And in FlinkKafkaConsumerBase.open():
> {code:java}
> if (restoredState != null) {
>     for (KafkaTopicPartition partition : allPartitions) {
>         if (!restoredState.containsKey(partition)) {
>             restoredState.put(partition,
>                 KafkaTopicPartitionStateSentinel.EARLIEST_OFFSET);
>         }
>     }
> {code}
> This is where the consumer is initialized. If a job is restarted from a
> savepoint, restoredState is at least an empty TreeMap, so this code sets every
> partition missing from it to consume from
> KafkaTopicPartitionStateSentinel.EARLIEST_OFFSET.
> I changed the code like this:
> {code:java}
> if (restoredState != null && !restoredState.isEmpty()) {
> ....
> {code}
>
> and this works well for me.
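To make the difference between the two checks concrete, here is a self-contained sketch that imitates the offset seeding described in the report, using plain strings for partitions. It is not the Flink implementation: the class RestoreSketch, the method seedOffsets, and the skipEmptyState flag are hypothetical names introduced only for illustration, and a null return stands in for "fall back to the configured startup mode".

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified stand-in for the offset seeding in FlinkKafkaConsumerBase.open().
// Partitions are plain strings and every name here is hypothetical.
class RestoreSketch {
    // Sentinel value as it appears in the task manager log in the report.
    static final long EARLIEST_OFFSET = -915623761775L;

    // Returns the offsets the consumer would start from, or null when the
    // restored state is ignored and the configured startup mode would apply.
    static Map<String, Long> seedOffsets(Map<String, Long> restoredState,
                                         List<String> allPartitions,
                                         boolean skipEmptyState) {
        boolean useRestored = restoredState != null
                && !(skipEmptyState && restoredState.isEmpty());
        if (!useRestored) {
            return null;
        }
        Map<String, Long> offsets = new TreeMap<>(restoredState);
        for (String partition : allPartitions) {
            // Partitions missing from the restored state are forced to earliest.
            if (!offsets.containsKey(partition)) {
                offsets.put(partition, EARLIEST_OFFSET);
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        List<String> partitions = Arrays.asList("binlog_post_topic-0");
        Map<String, Long> emptyRestored = new TreeMap<>();
        // Original check: restoredState != null
        System.out.println("original check: " + seedOffsets(emptyRestored, partitions, false));
        // Proposed check: restoredState != null && !restoredState.isEmpty()
        System.out.println("proposed check: " + seedOffsets(emptyRestored, partitions, true));
    }
}
```

With an empty restored state, the original check seeds the partition with the earliest sentinel, while the proposed check skips the restored state entirely.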
>
> I just noticed this is a Chinese-language Jira, crying.
>
>
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)