[ 
https://issues.apache.org/jira/browse/FLINK-14076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16931704#comment-16931704
 ] 

Jeffrey Martin commented on FLINK-14076:
----------------------------------------

It looks like this is due to a KafkaException being thrown while Flink is 
trying to checkpoint. The KafkaException gets wrapped in a 'DeclineCheckpoint' 
message and sent back to the JobManager. The JobManager can't deserialize it, 
and fails with the ClassNotFoundException, because its classpath does not 
include org.apache.kafka:kafka-clients by default.
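
A rough sketch of the failure mode, not Flink's actual code: every class name 
below (FakeKafkaException, FakeDeclineCheckpoint, MissingClassInputStream) is 
a hypothetical stand-in. The missing kafka-clients jar is simulated by an 
ObjectInputStream that refuses to resolve the exception class. The real 
JobManager path may differ in detail.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

class DeclineCheckpointSketch {

    // Hypothetical stand-in for a connector exception that only the
    // TaskManager side has on its classpath.
    static class FakeKafkaException extends RuntimeException {
        private static final long serialVersionUID = 1L;
        FakeKafkaException(String message) { super(message); }
    }

    // Hypothetical stand-in for the decline message that carries the cause.
    static class FakeDeclineCheckpoint implements Serializable {
        private static final long serialVersionUID = 1L;
        final Throwable reason;
        FakeDeclineCheckpoint(Throwable reason) { this.reason = reason; }
    }

    // Simulates a receiver whose classpath lacks one class: resolving that
    // class fails the same way a missing jar would.
    static class MissingClassInputStream extends ObjectInputStream {
        private final String missingClassName;

        MissingClassInputStream(InputStream in, String missingClassName) throws IOException {
            super(in);
            this.missingClassName = missingClassName;
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            if (desc.getName().equals(missingClassName)) {
                throw new ClassNotFoundException(desc.getName());
            }
            return super.resolveClass(desc);
        }
    }

    // Sender side: plain Java serialization of the message and its cause.
    static byte[] serialize(Object message) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(message);
        }
        return bytes.toByteArray();
    }

    // Receiver side: the wrapper class resolves, the nested exception does not,
    // so reading the whole message fails.
    static String receiveOnJobManager(byte[] payload) throws IOException {
        try (ObjectInputStream in = new MissingClassInputStream(
                new ByteArrayInputStream(payload), FakeKafkaException.class.getName())) {
            in.readObject();
            return "deserialized";
        } catch (ClassNotFoundException e) {
            return "ClassNotFoundException: " + e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = serialize(new FakeDeclineCheckpoint(new FakeKafkaException("boom")));
        System.out.println(receiveOnJobManager(payload));
    }
}
```

Running main should report a ClassNotFoundException for the nested exception 
class even though the wrapper class itself is available to the receiver.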

I'm going to dig in some more and see if I can figure out which exception is 
at fault.

> 'ClassNotFoundException: KafkaException' on Flink v1.9 w/ checkpointing
> -----------------------------------------------------------------------
>
>                 Key: FLINK-14076
>                 URL: https://issues.apache.org/jira/browse/FLINK-14076
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Kafka
>    Affects Versions: 1.9.0
>            Reporter: Jeffrey Martin
>            Priority: Major
>         Attachments: error.txt
>
>
> A Flink job that worked with checkpointing on a Flink v1.8.0 cluster fails on 
> a Flink v1.9.0 cluster with checkpointing. It works on a Flink v1.9.0 cluster 
> _without_ checkpointing. It is specifically _enabling checkpointing on 
> v1.9.0_ that causes the JM to start throwing ClassNotFoundExceptions. Full 
> stacktrace: [^error.txt]
> The job reads from Kafka via FlinkKafkaConsumer and writes to Kafka via 
> FlinkKafkaProducer.
> The jobmanagers and taskmanagers are standalone.
> The exception is being raised deep in some Flink serialization code, so I'm 
> not sure how to go about stepping through this in a debugger. The issue is 
> happening in an internal repository at my job, but I can try to get a minimal 
> repro on GitHub if it's not obvious from the error message alone what's 
> broken.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
