[
https://issues.apache.org/jira/browse/FLINK-21472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17289790#comment-17289790
]
Matthias commented on FLINK-21472:
----------------------------------
There seems to be a connection issue when retrieving the leader information:
{code}
2021-02-24 09:48:09,749 WARN io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - Exec Failure
java.net.SocketTimeoutException: sent ping but didn't receive pong within 30000ms (after 0 successful ping/pongs)
    at org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket.writePingFrame(RealWebSocket.java:546) [flink-kubernetes_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
    at org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket$PingRunnable.run(RealWebSocket.java:530) [flink-kubernetes_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_282]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_282]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_282]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_282]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_282]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_282]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]
2021-02-24 09:48:10,761 INFO org.apache.flink.kubernetes.highavailability.KubernetesLeaderElectionDriver [] - Creating a new watch on ConfigMap mta-flink-dispatcher-leader.
2021-02-24 09:49:10,764 WARN io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - Exec Failure
java.net.SocketTimeoutException: sent ping but didn't receive pong within 30000ms (after 0 successful ping/pongs)
{code}
The strange thing is that the {{FencingTokenException}}s occur even before the
30-second timeout has elapsed. [~fly_in_gis], are you able to get anything
else out of the logs?
> FencingTokenException: Fencing token mismatch
> ---------------------------------------------
>
> Key: FLINK-21472
> URL: https://issues.apache.org/jira/browse/FLINK-21472
> Project: Flink
> Issue Type: Bug
> Components: Deployment / Kubernetes
> Affects Versions: 1.12.1
> Reporter: hayden zhou
> Priority: Major
> Attachments:
> flink--standalonesession-0-mta-flink-jobmanager-864d6c8cbb-rmsxw.log
>
>
> org.apache.flink.runtime.rest.handler.job.JobsOverviewHandler [] - Unhandled
> exception.
> org.apache.flink.runtime.rpc.exceptions.FencingTokenException: Fencing token
> mismatch: Ignoring message
> LocalFencedMessage(8fac01d8e3e3988223a2e5c6e3f04f1e,
> LocalRpcInvocation(requestMultipleJobDetails(Time))) because the fencing
> token 8fac01d8e3e3988223a2e5c6e3f04f1e did not match the expected fencing
> token 8c37414f464bca76144e6cabc946474b.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)