[
https://issues.apache.org/jira/browse/FLINK-20114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17238168#comment-17238168
]
Robert Metzger commented on FLINK-20114:
----------------------------------------
Testing Issue 6:
I created a topic with 64 partitions, and started the job, which started
failing with:
{code}
2020-11-24 15:25:00,530 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: Kafka
Source (5/6) (74a3b791c148ea9ce9d17ae59f59becc) switched from RUNNING to FAILED
on org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot@10db45bb.
java.lang.Exception: Could not perform checkpoint 11 for operator Source: Kafka
Source (5/6)#10.
at
org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpoint(StreamTask.java:866)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$triggerCheckpointAsync$8(StreamTask.java:831)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:47)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:78)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:302)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:184)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:575)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:539)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at java.lang.Thread.run(Thread.java:832) ~[?:?]
Caused by: org.apache.flink.runtime.checkpoint.CheckpointException: Could not
complete snapshot 11 for operator Source: Kafka Source (5/6)#10. Failure
reason: Checkpoint was declined.
at
org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:226)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:158)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:343)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.checkpointStreamOperator(SubtaskCheckpointCoordinatorImpl.java:603)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.buildOperatorSnapshotFutures(SubtaskCheckpointCoordinatorImpl.java:529)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.takeSnapshotSync(SubtaskCheckpointCoordinatorImpl.java:496)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.checkpointState(SubtaskCheckpointCoordinatorImpl.java:266)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$performCheckpoint$9(StreamTask.java:924)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:47)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:914)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpoint(StreamTask.java:857)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
... 10 more
Caused by: org.apache.flink.util.SerializedThrowable: Invalid negative offset
at
org.apache.kafka.clients.consumer.OffsetAndMetadata.<init>(OffsetAndMetadata.java:50)
~[?:?]
at
org.apache.kafka.clients.consumer.OffsetAndMetadata.<init>(OffsetAndMetadata.java:69)
~[?:?]
at
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader.snapshotState(KafkaSourceReader.java:96)
~[?:?]
at
org.apache.flink.streaming.api.operators.SourceOperator.snapshotState(SourceOperator.java:264)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:197)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:158)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:343)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.checkpointStreamOperator(SubtaskCheckpointCoordinatorImpl.java:603)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.buildOperatorSnapshotFutures(SubtaskCheckpointCoordinatorImpl.java:529)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.takeSnapshotSync(SubtaskCheckpointCoordinatorImpl.java:496)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.checkpointState(SubtaskCheckpointCoordinatorImpl.java:266)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$performCheckpoint$9(StreamTask.java:924)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:47)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:914)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at
org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpoint(StreamTask.java:857)
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
... 10 more
{code}
> Test Kafka Source based on the new Source API
> ----------------------------------------------
>
> Key: FLINK-20114
> URL: https://issues.apache.org/jira/browse/FLINK-20114
> Project: Flink
> Issue Type: Sub-task
> Components: Connectors / Kafka
> Affects Versions: 1.12.0
> Reporter: Robert Metzger
> Assignee: Roman Khachatryan
> Priority: Critical
> Fix For: 1.12.0
>
> Attachments: Screenshot 2020-11-24 at 15.14.35.png
>
>
> Feature introduced in https://issues.apache.org/jira/browse/FLINK-18323
> --------
> [General Information about the Flink 1.12 release
> testing|https://cwiki.apache.org/confluence/display/FLINK/1.12+Release+-+Community+Testing]
> When testing a feature, consider the following aspects:
> - Is the documentation easy to understand
> - Are the error messages, log messages, APIs etc. easy to understand
> - Is the feature working as expected under normal conditions
> - Is the feature working / failing as expected with invalid input, induced
> errors etc.
> If you find a problem during testing, please file a ticket
> (Priority=Critical; Fix Version = 1.12.0), and link it in this testing ticket.
> During the testing, and once you are finished, please write a short summary
> of all things you have tested.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)