[
https://issues.apache.org/jira/browse/FLINK-20798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
hayden zhou updated FLINK-20798:
--------------------------------
Description:
When standalone Flink is deployed on Kubernetes with
{{high-availability.storageDir}} set to a mounted PVC directory, the Flink web UI
cannot be accessed. It shows "Service temporarily unavailable
due to an ongoing leader election. Please refresh".
The following are the related logs from the JobManager.
{code}
2020-12-29T06:45:54.177850394Z 2020-12-29 14:45:54,177 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] - Leader
election started
2020-12-29T06:45:54.177855303Z 2020-12-29 14:45:54,177 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
Attempting to acquire leader lease 'ConfigMapLock: default -
mta-flink-resourcemanager-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'...
2020-12-29T06:45:54.178668055Z 2020-12-29 14:45:54,178 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - WebSocket
successfully opened
2020-12-29T06:45:54.178895963Z 2020-12-29 14:45:54,178 INFO
org.apache.flink.runtime.leaderretrieval.DefaultLeaderRetrievalService [] -
Starting DefaultLeaderRetrievalService with
KubernetesLeaderRetrievalDriver{configMapName='mta-flink-resourcemanager-leader'}.
2020-12-29T06:45:54.179327491Z 2020-12-29 14:45:54,179 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
Connecting websocket ...
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@6d303498
2020-12-29T06:45:54.230081993Z 2020-12-29 14:45:54,229 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - WebSocket
successfully opened
2020-12-29T06:45:54.230202329Z 2020-12-29 14:45:54,230 INFO
org.apache.flink.runtime.leaderretrieval.DefaultLeaderRetrievalService [] -
Starting DefaultLeaderRetrievalService with
KubernetesLeaderRetrievalDriver{configMapName='mta-flink-dispatcher-leader'}.
2020-12-29T06:45:54.230219281Z 2020-12-29 14:45:54,229 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - WebSocket
successfully opened
2020-12-29T06:45:54.230353912Z 2020-12-29 14:45:54,230 INFO
org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] -
Starting DefaultLeaderElectionService with
KubernetesLeaderElectionDriver{configMapName='mta-flink-resourcemanager-leader'}.
2020-12-29T06:45:54.237004177Z 2020-12-29 14:45:54,236 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] - Leader
changed from null to 6f6479c6-86cc-4d62-84f9-37ff968bd0e5
2020-12-29T06:45:54.237024655Z 2020-12-29 14:45:54,236 INFO
org.apache.flink.kubernetes.kubeclient.resources.KubernetesLeaderElector [] -
New leader elected 6f6479c6-86cc-4d62-84f9-37ff968bd0e5 for
mta-flink-restserver-leader.
2020-12-29T06:45:54.237027811Z 2020-12-29 14:45:54,236 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
Successfully Acquired leader lease 'ConfigMapLock: default -
mta-flink-restserver-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'
2020-12-29T06:45:54.237297376Z 2020-12-29 14:45:54,237 DEBUG
org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] - Grant
leadership to contender
http://mta-flink-jobmanager:8081 with
session ID 9587e13f-322f-4cd5-9fff-b4941462be0f.
2020-12-29T06:45:54.237353551Z 2020-12-29 14:45:54,237 INFO
org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint [] -
http://mta-flink-jobmanager:8081 was
granted leadership with leaderSessionID=9587e13f-322f-4cd5-9fff-b4941462be0f
2020-12-29T06:45:54.237440354Z 2020-12-29 14:45:54,237 DEBUG
org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] -
Confirm leader session ID 9587e13f-322f-4cd5-9fff-b4941462be0f for leader
http://mta-flink-jobmanager:8081.
2020-12-29T06:45:54.254555127Z 2020-12-29 14:45:54,254 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] - Leader
changed from null to 6f6479c6-86cc-4d62-84f9-37ff968bd0e5
2020-12-29T06:45:54.254588299Z 2020-12-29 14:45:54,254 INFO
org.apache.flink.kubernetes.kubeclient.resources.KubernetesLeaderElector [] -
New leader elected 6f6479c6-86cc-4d62-84f9-37ff968bd0e5 for
mta-flink-resourcemanager-leader.
2020-12-29T06:45:54.254628053Z 2020-12-29 14:45:54,254 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
Successfully Acquired leader lease 'ConfigMapLock: default -
mta-flink-resourcemanager-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'
2020-12-29T06:45:54.254871569Z 2020-12-29 14:45:54,254 DEBUG
org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] - Grant
leadership to contender LeaderContender: StandaloneResourceManager with session
ID b1730dc6-0f94-49f4-b519-56917f3027b7.
2020-12-29T06:45:54.256608291Z 2020-12-29 14:45:54,256 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
Attempting to renew leader lease 'ConfigMapLock: default -
mta-flink-resourcemanager-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'...
2020-12-29T06:45:54.259155793Z 2020-12-29 14:45:54,258 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] - Leader
changed from null to 6f6479c6-86cc-4d62-84f9-37ff968bd0e5
2020-12-29T06:45:54.259176091Z 2020-12-29 14:45:54,258 INFO
org.apache.flink.kubernetes.kubeclient.resources.KubernetesLeaderElector [] -
New leader elected 6f6479c6-86cc-4d62-84f9-37ff968bd0e5 for
mta-flink-dispatcher-leader.
2020-12-29T06:45:54.25918096Z 2020-12-29 14:45:54,259 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
Successfully Acquired leader lease 'ConfigMapLock: default -
mta-flink-dispatcher-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'
2020-12-29T06:45:54.259362149Z 2020-12-29 14:45:54,259 DEBUG
org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] - Grant
leadership to contender LeaderContender: DefaultDispatcherRunner with session
ID fbbaa883-69f6-43df-9ca0-c646bc1baad1.
2020-12-29T06:45:54.260301799Z 2020-12-29 14:45:54,260 DEBUG
org.apache.flink.runtime.dispatcher.runner.DefaultDispatcherRunner [] - Create
new DispatcherLeaderProcess with leader session id
fbbaa883-69f6-43df-9ca0-c646bc1baad1.
2020-12-29T06:45:54.266724597Z 2020-12-29 14:45:54,266 INFO
org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] -
Start SessionDispatcherLeaderProcess.
2020-12-29T06:45:54.267718418Z 2020-12-29 14:45:54,267 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
Attempting to renew leader lease 'ConfigMapLock: default -
mta-flink-dispatcher-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'...
2020-12-29T06:45:54.26786349Z 2020-12-29 14:45:54,267 INFO
org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] -
Recover all persisted job graphs.
2020-12-29T06:45:54.267976912Z 2020-12-29 14:45:54,267 DEBUG
org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Retrieving all
stored job ids from
KubernetesStateHandleStore{configMapName='mta-flink-dispatcher-leader'}.
2020-12-29T06:45:54.277681598Z 2020-12-29 14:45:54,277 INFO
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] -
ResourceManager
akka.tcp://flink@mta-flink-jobmanager:6123/user/rpc/resourcemanager_0 was
granted leadership with fencing token b51956917f3027b7b1730dc60f9449f4
2020-12-29T06:45:54.280411279Z 2020-12-29 14:45:54,280 INFO
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManagerImpl [] -
Starting the SlotManager.
2020-12-29T06:45:54.281367931Z 2020-12-29 14:45:54,281 DEBUG
org.apache.flink.kubernetes.highavailability.KubernetesLeaderElectionDriver []
- Successfully wrote leader information:
Leader=http://mta-flink-jobmanager:8081,
session ID=9587e13f-322f-4cd5-9fff-b4941462be0f.
2020-12-29T06:45:54.281528772Z 2020-12-29 14:45:54,281 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
Attempting to renew leader lease 'ConfigMapLock: default -
mta-flink-restserver-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'...
2020-12-29T06:45:54.286191344Z 2020-12-29 14:45:54,286 DEBUG
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Trigger
heartbeat request.
2020-12-29T06:45:54.286304807Z 2020-12-29 14:45:54,286 DEBUG
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Trigger
heartbeat request.
2020-12-29T06:45:54.286438227Z 2020-12-29 14:45:54,286 DEBUG
org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] -
Confirm leader session ID b1730dc6-0f94-49f4-b519-56917f3027b7 for leader
akka.tcp://flink@mta-flink-jobmanager:6123/user/rpc/resourcemanager_0.
2020-12-29T06:45:54.309361096Z 2020-12-29 14:45:54,309 DEBUG
org.apache.flink.kubernetes.highavailability.KubernetesLeaderElectionDriver []
- Successfully wrote leader information:
Leader=akka.tcp://flink@mta-flink-jobmanager:6123/user/rpc/resourcemanager_0,
session ID=b1730dc6-0f94-49f4-b519-56917f3027b7.
2020-12-29T06:45:54.320673232Z 2020-12-29 14:45:54,320 INFO
org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Retrieved job ids
[] from KubernetesStateHandleStore{configMapName='mta-flink-dispatcher-leader'}
2020-12-29T06:45:54.3206989Z 2020-12-29 14:45:54,320 INFO
org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] -
Successfully recovered 0 persisted job graphs.
2020-12-29T06:45:54.324829616Z 2020-12-29 14:45:54,324 DEBUG
org.apache.flink.runtime.rpc.akka.SupervisorActor [] - Starting
FencedAkkaRpcActor with name dispatcher_1.
2020-12-29T06:45:54.325343659Z 2020-12-29 14:45:54,325 INFO
org.apache.flink.runtime.rpc.akka.AkkaRpcService [] - Starting RPC endpoint for
org.apache.flink.runtime.dispatcher.StandaloneDispatcher at
akka://flink/user/rpc/dispatcher_1 .
2020-12-29T06:45:54.33778039Z 2020-12-29 14:45:54,337 DEBUG
org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] -
Confirm leader session ID fbbaa883-69f6-43df-9ca0-c646bc1baad1 for leader
akka.tcp://flink@mta-flink-jobmanager:6123/user/rpc/dispatcher_1.
2020-12-29T06:45:54.36249763Z 2020-12-29 14:45:54,362 DEBUG
org.apache.flink.kubernetes.highavailability.KubernetesLeaderElectionDriver []
- Successfully wrote leader information:
Leader=akka.tcp://flink@mta-flink-jobmanager:6123/user/rpc/dispatcher_1,
session ID=fbbaa883-69f6-43df-9ca0-c646bc1baad1.
2020-12-29T06:46:04.298366262Z 2020-12-29 14:46:04,297 DEBUG
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Trigger
heartbeat request.
2020-12-29T06:46:04.298442695Z 2020-12-29 14:46:04,298 DEBUG
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Trigger
heartbeat request.
2020-12-29T06:46:14.318174464Z 2020-12-29 14:46:14,317 DEBUG
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Trigger
heartbeat request.
2020-12-29T06:46:14.318256849Z 2020-12-29 14:46:14,318 DEBUG
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Trigger
heartbeat request.
2020-12-29T06:46:24.337694477Z 2020-12-29 14:46:24,337 DEBUG
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Trigger
heartbeat request.
2020-12-29T06:46:24.337816516Z 2020-12-29 14:46:24,337 DEBUG
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Trigger
heartbeat request.
2020-12-29T06:46:26.044624193Z 2020-12-29 14:46:26,044 DEBUG
org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBuf [] -
-Dorg.apache.flink.shaded.netty4.io.netty.buffer.checkAccessible: true
{code}
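To check from the log alone that each election completed, the wrapped records
can be re-joined and scanned. The following is a minimal Python sketch, not part
of Flink: the helper names join_records and summarize are hypothetical, and it
assumes that every record starts with the container timestamp (for example
2020-12-29T06:45:54.177850394Z) and that the message texts match the ones shown
above.

```python
import re

# Assumption: each record begins with the container timestamp,
# e.g. 2020-12-29T06:45:54.177850394Z
RECORD_START = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z\s")
# Captures the ConfigMap name from the fabric8 LeaderElector message.
LEASE_ACQUIRED = re.compile(
    r"Successfully Acquired leader lease 'ConfigMapLock: \S+ - (\S+) \("
)
LEADER_INFO_WRITTEN = "Successfully wrote leader information"


def join_records(text):
    """Re-join hard-wrapped lines into one string per log record."""
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)       # a new record starts here
        else:
            records[-1] += " " + line  # continuation of the previous record
    return records


def summarize(text):
    """Return the ConfigMaps whose lease was acquired and the number of leader-info writes."""
    acquired, writes = [], 0
    for record in join_records(text):
        match = LEASE_ACQUIRED.search(record)
        if match:
            acquired.append(match.group(1))
        if LEADER_INFO_WRITTEN in record:
            writes += 1
    return {"acquired": sorted(acquired), "leader_info_writes": writes}


# Excerpt copied from the log above, still hard-wrapped as in the mail.
SAMPLE = """\
2020-12-29T06:45:54.237027811Z 2020-12-29 14:45:54,236 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
Successfully Acquired leader lease 'ConfigMapLock: default -
mta-flink-restserver-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'
2020-12-29T06:45:54.36249763Z 2020-12-29 14:45:54,362 DEBUG
org.apache.flink.kubernetes.highavailability.KubernetesLeaderElectionDriver []
- Successfully wrote leader information:
Leader=akka.tcp://flink@mta-flink-jobmanager:6123/user/rpc/dispatcher_1,
session ID=fbbaa883-69f6-43df-9ca0-c646bc1baad1.
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Applied to the full log in this description, it should list the restserver,
resourcemanager and dispatcher ConfigMaps as acquired and count three
leader-information writes. That is consistent with the election itself
succeeding while the web UI still reports an ongoing election.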
was:
I deployed Flink to K8s with a PVC as the high-availability storageDir. The
JobManager log shows that the leader election succeeded, but the web UI keeps
reporting that an election is in progress.
When standalone Flink is deployed on Kubernetes with
{{high-availability.storageDir}} set to a mounted PVC directory, the Flink web UI
cannot be accessed. It shows "Service temporarily unavailable
due to an ongoing leader election. Please refresh".
The following are the related logs from the JobManager.
{code}
2020-12-29T06:45:54.177850394Z 2020-12-29 14:45:54,177 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] - Leader
election started
2020-12-29T06:45:54.177855303Z 2020-12-29 14:45:54,177 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
Attempting to acquire leader lease 'ConfigMapLock: default -
mta-flink-resourcemanager-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'...
2020-12-29T06:45:54.178668055Z 2020-12-29 14:45:54,178 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - WebSocket
successfully opened
2020-12-29T06:45:54.178895963Z 2020-12-29 14:45:54,178 INFO
org.apache.flink.runtime.leaderretrieval.DefaultLeaderRetrievalService [] -
Starting DefaultLeaderRetrievalService with
KubernetesLeaderRetrievalDriver{configMapName='mta-flink-resourcemanager-leader'}.
2020-12-29T06:45:54.179327491Z 2020-12-29 14:45:54,179 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
Connecting websocket ...
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@6d303498
2020-12-29T06:45:54.230081993Z 2020-12-29 14:45:54,229 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - WebSocket
successfully opened
2020-12-29T06:45:54.230202329Z 2020-12-29 14:45:54,230 INFO
org.apache.flink.runtime.leaderretrieval.DefaultLeaderRetrievalService [] -
Starting DefaultLeaderRetrievalService with
KubernetesLeaderRetrievalDriver{configMapName='mta-flink-dispatcher-leader'}.
2020-12-29T06:45:54.230219281Z 2020-12-29 14:45:54,229 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - WebSocket
successfully opened
2020-12-29T06:45:54.230353912Z 2020-12-29 14:45:54,230 INFO
org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] -
Starting DefaultLeaderElectionService with
KubernetesLeaderElectionDriver{configMapName='mta-flink-resourcemanager-leader'}.
2020-12-29T06:45:54.237004177Z 2020-12-29 14:45:54,236 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] - Leader
changed from null to 6f6479c6-86cc-4d62-84f9-37ff968bd0e5
2020-12-29T06:45:54.237024655Z 2020-12-29 14:45:54,236 INFO
org.apache.flink.kubernetes.kubeclient.resources.KubernetesLeaderElector [] -
New leader elected 6f6479c6-86cc-4d62-84f9-37ff968bd0e5 for
mta-flink-restserver-leader.
2020-12-29T06:45:54.237027811Z 2020-12-29 14:45:54,236 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
Successfully Acquired leader lease 'ConfigMapLock: default -
mta-flink-restserver-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'
2020-12-29T06:45:54.237297376Z 2020-12-29 14:45:54,237 DEBUG
org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] - Grant
leadership to contender
http://mta-flink-jobmanager:8081 with
session ID 9587e13f-322f-4cd5-9fff-b4941462be0f.
2020-12-29T06:45:54.237353551Z 2020-12-29 14:45:54,237 INFO
org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint [] -
http://mta-flink-jobmanager:8081 was
granted leadership with leaderSessionID=9587e13f-322f-4cd5-9fff-b4941462be0f
2020-12-29T06:45:54.237440354Z 2020-12-29 14:45:54,237 DEBUG
org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] -
Confirm leader session ID 9587e13f-322f-4cd5-9fff-b4941462be0f for leader
http://mta-flink-jobmanager:8081.
2020-12-29T06:45:54.254555127Z 2020-12-29 14:45:54,254 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] - Leader
changed from null to 6f6479c6-86cc-4d62-84f9-37ff968bd0e5
2020-12-29T06:45:54.254588299Z 2020-12-29 14:45:54,254 INFO
org.apache.flink.kubernetes.kubeclient.resources.KubernetesLeaderElector [] -
New leader elected 6f6479c6-86cc-4d62-84f9-37ff968bd0e5 for
mta-flink-resourcemanager-leader.
2020-12-29T06:45:54.254628053Z 2020-12-29 14:45:54,254 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
Successfully Acquired leader lease 'ConfigMapLock: default -
mta-flink-resourcemanager-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'
2020-12-29T06:45:54.254871569Z 2020-12-29 14:45:54,254 DEBUG
org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] - Grant
leadership to contender LeaderContender: StandaloneResourceManager with session
ID b1730dc6-0f94-49f4-b519-56917f3027b7.
2020-12-29T06:45:54.256608291Z 2020-12-29 14:45:54,256 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
Attempting to renew leader lease 'ConfigMapLock: default -
mta-flink-resourcemanager-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'...
2020-12-29T06:45:54.259155793Z 2020-12-29 14:45:54,258 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] - Leader
changed from null to 6f6479c6-86cc-4d62-84f9-37ff968bd0e5
2020-12-29T06:45:54.259176091Z 2020-12-29 14:45:54,258 INFO
org.apache.flink.kubernetes.kubeclient.resources.KubernetesLeaderElector [] -
New leader elected 6f6479c6-86cc-4d62-84f9-37ff968bd0e5 for
mta-flink-dispatcher-leader.
2020-12-29T06:45:54.25918096Z 2020-12-29 14:45:54,259 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
Successfully Acquired leader lease 'ConfigMapLock: default -
mta-flink-dispatcher-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'
2020-12-29T06:45:54.259362149Z 2020-12-29 14:45:54,259 DEBUG
org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] - Grant
leadership to contender LeaderContender: DefaultDispatcherRunner with session
ID fbbaa883-69f6-43df-9ca0-c646bc1baad1.
2020-12-29T06:45:54.260301799Z 2020-12-29 14:45:54,260 DEBUG
org.apache.flink.runtime.dispatcher.runner.DefaultDispatcherRunner [] - Create
new DispatcherLeaderProcess with leader session id
fbbaa883-69f6-43df-9ca0-c646bc1baad1.
2020-12-29T06:45:54.266724597Z 2020-12-29 14:45:54,266 INFO
org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] -
Start SessionDispatcherLeaderProcess.
2020-12-29T06:45:54.267718418Z 2020-12-29 14:45:54,267 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
Attempting to renew leader lease 'ConfigMapLock: default -
mta-flink-dispatcher-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'...
2020-12-29T06:45:54.26786349Z 2020-12-29 14:45:54,267 INFO
org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] -
Recover all persisted job graphs.
2020-12-29T06:45:54.267976912Z 2020-12-29 14:45:54,267 DEBUG
org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Retrieving all
stored job ids from
KubernetesStateHandleStore{configMapName='mta-flink-dispatcher-leader'}.
2020-12-29T06:45:54.277681598Z 2020-12-29 14:45:54,277 INFO
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] -
ResourceManager
akka.tcp://flink@mta-flink-jobmanager:6123/user/rpc/resourcemanager_0 was
granted leadership with fencing token b51956917f3027b7b1730dc60f9449f4
2020-12-29T06:45:54.280411279Z 2020-12-29 14:45:54,280 INFO
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManagerImpl [] -
Starting the SlotManager.
2020-12-29T06:45:54.281367931Z 2020-12-29 14:45:54,281 DEBUG
org.apache.flink.kubernetes.highavailability.KubernetesLeaderElectionDriver []
- Successfully wrote leader information:
Leader=http://mta-flink-jobmanager:8081,
session ID=9587e13f-322f-4cd5-9fff-b4941462be0f.
2020-12-29T06:45:54.281528772Z 2020-12-29 14:45:54,281 DEBUG
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
Attempting to renew leader lease 'ConfigMapLock: default -
mta-flink-restserver-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'...
2020-12-29T06:45:54.286191344Z 2020-12-29 14:45:54,286 DEBUG
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Trigger
heartbeat request.
2020-12-29T06:45:54.286304807Z 2020-12-29 14:45:54,286 DEBUG
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Trigger
heartbeat request.
2020-12-29T06:45:54.286438227Z 2020-12-29 14:45:54,286 DEBUG
org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] -
Confirm leader session ID b1730dc6-0f94-49f4-b519-56917f3027b7 for leader
akka.tcp://flink@mta-flink-jobmanager:6123/user/rpc/resourcemanager_0.
2020-12-29T06:45:54.309361096Z 2020-12-29 14:45:54,309 DEBUG
org.apache.flink.kubernetes.highavailability.KubernetesLeaderElectionDriver []
- Successfully wrote leader information:
Leader=akka.tcp://flink@mta-flink-jobmanager:6123/user/rpc/resourcemanager_0,
session ID=b1730dc6-0f94-49f4-b519-56917f3027b7.
2020-12-29T06:45:54.320673232Z 2020-12-29 14:45:54,320 INFO
org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Retrieved job ids
[] from KubernetesStateHandleStore{configMapName='mta-flink-dispatcher-leader'}
2020-12-29T06:45:54.3206989Z 2020-12-29 14:45:54,320 INFO
org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] -
Successfully recovered 0 persisted job graphs.
2020-12-29T06:45:54.324829616Z 2020-12-29 14:45:54,324 DEBUG
org.apache.flink.runtime.rpc.akka.SupervisorActor [] - Starting
FencedAkkaRpcActor with name dispatcher_1.
2020-12-29T06:45:54.325343659Z 2020-12-29 14:45:54,325 INFO
org.apache.flink.runtime.rpc.akka.AkkaRpcService [] - Starting RPC endpoint for
org.apache.flink.runtime.dispatcher.StandaloneDispatcher at
akka://flink/user/rpc/dispatcher_1 .
2020-12-29T06:45:54.33778039Z 2020-12-29 14:45:54,337 DEBUG
org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] -
Confirm leader session ID fbbaa883-69f6-43df-9ca0-c646bc1baad1 for leader
akka.tcp://flink@mta-flink-jobmanager:6123/user/rpc/dispatcher_1.
2020-12-29T06:45:54.36249763Z 2020-12-29 14:45:54,362 DEBUG
org.apache.flink.kubernetes.highavailability.KubernetesLeaderElectionDriver []
- Successfully wrote leader information:
Leader=akka.tcp://flink@mta-flink-jobmanager:6123/user/rpc/dispatcher_1,
session ID=fbbaa883-69f6-43df-9ca0-c646bc1baad1.
2020-12-29T06:46:04.298366262Z 2020-12-29 14:46:04,297 DEBUG
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Trigger
heartbeat request.
2020-12-29T06:46:04.298442695Z 2020-12-29 14:46:04,298 DEBUG
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Trigger
heartbeat request.
2020-12-29T06:46:14.318174464Z 2020-12-29 14:46:14,317 DEBUG
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Trigger
heartbeat request.
2020-12-29T06:46:14.318256849Z 2020-12-29 14:46:14,318 DEBUG
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Trigger
heartbeat request.
2020-12-29T06:46:24.337694477Z 2020-12-29 14:46:24,337 DEBUG
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Trigger
heartbeat request.
2020-12-29T06:46:24.337816516Z 2020-12-29 14:46:24,337 DEBUG
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Trigger
heartbeat request.
2020-12-29T06:46:26.044624193Z 2020-12-29 14:46:26,044 DEBUG
org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBuf [] -
-Dorg.apache.flink.shaded.netty4.io.netty.buffer.checkAccessible: true
{code}
> Using PVC as high-availability.storageDir could not work
> --------------------------------------------------------
>
> Key: FLINK-20798
> URL: https://issues.apache.org/jira/browse/FLINK-20798
> Project: Flink
> Issue Type: Bug
> Components: Deployment / Kubernetes
> Affects Versions: 1.12.0
> Environment: FLINK 1.12.0
> Reporter: hayden zhou
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.13.0, 1.12.2
>
> Attachments: flink.log
>
>
> When standalone Flink is deployed on Kubernetes with
> {{high-availability.storageDir}} set to a mounted PVC directory, the Flink web UI
> cannot be accessed. It shows "Service temporarily unavailable
> due to an ongoing leader election. Please refresh".
>
> The following are the related logs from the JobManager.
> {code}
> 2020-12-29T06:45:54.177850394Z 2020-12-29 14:45:54,177 DEBUG
> io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
> Leader election started
> 2020-12-29T06:45:54.177855303Z 2020-12-29 14:45:54,177 DEBUG
> io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
> Attempting to acquire leader lease 'ConfigMapLock: default -
> mta-flink-resourcemanager-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'...
> 2020-12-29T06:45:54.178668055Z 2020-12-29 14:45:54,178 DEBUG
> io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
> WebSocket successfully opened
> 2020-12-29T06:45:54.178895963Z 2020-12-29 14:45:54,178 INFO
> org.apache.flink.runtime.leaderretrieval.DefaultLeaderRetrievalService [] -
> Starting DefaultLeaderRetrievalService with
> KubernetesLeaderRetrievalDriver{configMapName='mta-flink-resourcemanager-leader'}.
> 2020-12-29T06:45:54.179327491Z 2020-12-29 14:45:54,179 DEBUG
> io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
> Connecting websocket ...
> io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@6d303498
> 2020-12-29T06:45:54.230081993Z 2020-12-29 14:45:54,229 DEBUG
> io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
> WebSocket successfully opened
> 2020-12-29T06:45:54.230202329Z 2020-12-29 14:45:54,230 INFO
> org.apache.flink.runtime.leaderretrieval.DefaultLeaderRetrievalService [] -
> Starting DefaultLeaderRetrievalService with
> KubernetesLeaderRetrievalDriver{configMapName='mta-flink-dispatcher-leader'}.
> 2020-12-29T06:45:54.230219281Z 2020-12-29 14:45:54,229 DEBUG
> io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
> WebSocket successfully opened
> 2020-12-29T06:45:54.230353912Z 2020-12-29 14:45:54,230 INFO
> org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] -
> Starting DefaultLeaderElectionService with
> KubernetesLeaderElectionDriver{configMapName='mta-flink-resourcemanager-leader'}.
> 2020-12-29T06:45:54.237004177Z 2020-12-29 14:45:54,236 DEBUG
> io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
> Leader changed from null to 6f6479c6-86cc-4d62-84f9-37ff968bd0e5
> 2020-12-29T06:45:54.237024655Z 2020-12-29 14:45:54,236 INFO
> org.apache.flink.kubernetes.kubeclient.resources.KubernetesLeaderElector [] -
> New leader elected 6f6479c6-86cc-4d62-84f9-37ff968bd0e5 for
> mta-flink-restserver-leader.
> 2020-12-29T06:45:54.237027811Z 2020-12-29 14:45:54,236 DEBUG
> io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
> Successfully Acquired leader lease 'ConfigMapLock: default -
> mta-flink-restserver-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'
> 2020-12-29T06:45:54.237297376Z 2020-12-29 14:45:54,237 DEBUG
> org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] -
> Grant leadership to contender
> http://mta-flink-jobmanager:8081 with
> session ID 9587e13f-322f-4cd5-9fff-b4941462be0f.
> 2020-12-29T06:45:54.237353551Z 2020-12-29 14:45:54,237 INFO
> org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint [] -
> http://mta-flink-jobmanager:8081 was
> granted leadership with leaderSessionID=9587e13f-322f-4cd5-9fff-b4941462be0f
> 2020-12-29T06:45:54.237440354Z 2020-12-29 14:45:54,237 DEBUG
> org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] -
> Confirm leader session ID 9587e13f-322f-4cd5-9fff-b4941462be0f for leader
> http://mta-flink-jobmanager:8081.
> 2020-12-29T06:45:54.254555127Z 2020-12-29 14:45:54,254 DEBUG
> io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
> Leader changed from null to 6f6479c6-86cc-4d62-84f9-37ff968bd0e5
> 2020-12-29T06:45:54.254588299Z 2020-12-29 14:45:54,254 INFO
> org.apache.flink.kubernetes.kubeclient.resources.KubernetesLeaderElector [] -
> New leader elected 6f6479c6-86cc-4d62-84f9-37ff968bd0e5 for
> mta-flink-resourcemanager-leader.
> 2020-12-29T06:45:54.254628053Z 2020-12-29 14:45:54,254 DEBUG
> io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
> Successfully Acquired leader lease 'ConfigMapLock: default -
> mta-flink-resourcemanager-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'
> 2020-12-29T06:45:54.254871569Z 2020-12-29 14:45:54,254 DEBUG
> org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] -
> Grant leadership to contender LeaderContender: StandaloneResourceManager with
> session ID b1730dc6-0f94-49f4-b519-56917f3027b7.
> 2020-12-29T06:45:54.256608291Z 2020-12-29 14:45:54,256 DEBUG
> io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
> Attempting to renew leader lease 'ConfigMapLock: default -
> mta-flink-resourcemanager-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'...
> 2020-12-29T06:45:54.259155793Z 2020-12-29 14:45:54,258 DEBUG
> io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
> Leader changed from null to 6f6479c6-86cc-4d62-84f9-37ff968bd0e5
> 2020-12-29T06:45:54.259176091Z 2020-12-29 14:45:54,258 INFO
> org.apache.flink.kubernetes.kubeclient.resources.KubernetesLeaderElector [] -
> New leader elected 6f6479c6-86cc-4d62-84f9-37ff968bd0e5 for
> mta-flink-dispatcher-leader.
> 2020-12-29T06:45:54.25918096Z 2020-12-29 14:45:54,259 DEBUG
> io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
> Successfully Acquired leader lease 'ConfigMapLock: default -
> mta-flink-dispatcher-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'
> 2020-12-29T06:45:54.259362149Z 2020-12-29 14:45:54,259 DEBUG
> org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] -
> Grant leadership to contender LeaderContender: DefaultDispatcherRunner with
> session ID fbbaa883-69f6-43df-9ca0-c646bc1baad1.
> 2020-12-29T06:45:54.260301799Z 2020-12-29 14:45:54,260 DEBUG
> org.apache.flink.runtime.dispatcher.runner.DefaultDispatcherRunner [] -
> Create new DispatcherLeaderProcess with leader session id
> fbbaa883-69f6-43df-9ca0-c646bc1baad1.
> 2020-12-29T06:45:54.266724597Z 2020-12-29 14:45:54,266 INFO
> org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess []
> - Start SessionDispatcherLeaderProcess.
> 2020-12-29T06:45:54.267718418Z 2020-12-29 14:45:54,267 DEBUG
> io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
> Attempting to renew leader lease 'ConfigMapLock: default -
> mta-flink-dispatcher-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'...
> 2020-12-29T06:45:54.26786349Z 2020-12-29 14:45:54,267 INFO
> org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess []
> - Recover all persisted job graphs.
> 2020-12-29T06:45:54.267976912Z 2020-12-29 14:45:54,267 DEBUG
> org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Retrieving all
> stored job ids from
> KubernetesStateHandleStore{configMapName='mta-flink-dispatcher-leader'}.
> 2020-12-29T06:45:54.277681598Z 2020-12-29 14:45:54,277 INFO
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] -
> ResourceManager
> akka.tcp://flink@mta-flink-jobmanager:6123/user/rpc/resourcemanager_0 was
> granted leadership with fencing token b51956917f3027b7b1730dc60f9449f4
> 2020-12-29T06:45:54.280411279Z 2020-12-29 14:45:54,280 INFO
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManagerImpl [] -
> Starting the SlotManager.
> 2020-12-29T06:45:54.281367931Z 2020-12-29 14:45:54,281 DEBUG
> org.apache.flink.kubernetes.highavailability.KubernetesLeaderElectionDriver
> [] - Successfully wrote leader information:
> Leader=http://mta-flink-jobmanager:8081,
> session ID=9587e13f-322f-4cd5-9fff-b4941462be0f.
> 2020-12-29T06:45:54.281528772Z 2020-12-29 14:45:54,281 DEBUG
> io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] -
> Attempting to renew leader lease 'ConfigMapLock: default -
> mta-flink-restserver-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'...
> 2020-12-29T06:45:54.286191344Z 2020-12-29 14:45:54,286 DEBUG
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] -
> Trigger heartbeat request.
> 2020-12-29T06:45:54.286304807Z 2020-12-29 14:45:54,286 DEBUG
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] -
> Trigger heartbeat request.
> 2020-12-29T06:45:54.286438227Z 2020-12-29 14:45:54,286 DEBUG
> org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] -
> Confirm leader session ID b1730dc6-0f94-49f4-b519-56917f3027b7 for leader
> akka.tcp://flink@mta-flink-jobmanager:6123/user/rpc/resourcemanager_0.
> 2020-12-29T06:45:54.309361096Z 2020-12-29 14:45:54,309 DEBUG
> org.apache.flink.kubernetes.highavailability.KubernetesLeaderElectionDriver
> [] - Successfully wrote leader information:
> Leader=akka.tcp://flink@mta-flink-jobmanager:6123/user/rpc/resourcemanager_0,
> session ID=b1730dc6-0f94-49f4-b519-56917f3027b7.
> 2020-12-29T06:45:54.320673232Z 2020-12-29 14:45:54,320 INFO
> org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Retrieved job
> ids [] from
> KubernetesStateHandleStore{configMapName='mta-flink-dispatcher-leader'}
> 2020-12-29T06:45:54.3206989Z 2020-12-29 14:45:54,320 INFO
> org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess []
> - Successfully recovered 0 persisted job graphs.
> 2020-12-29T06:45:54.324829616Z 2020-12-29 14:45:54,324 DEBUG
> org.apache.flink.runtime.rpc.akka.SupervisorActor [] - Starting
> FencedAkkaRpcActor with name dispatcher_1.
> 2020-12-29T06:45:54.325343659Z 2020-12-29 14:45:54,325 INFO
> org.apache.flink.runtime.rpc.akka.AkkaRpcService [] - Starting RPC endpoint
> for org.apache.flink.runtime.dispatcher.StandaloneDispatcher at
> akka://flink/user/rpc/dispatcher_1 .
> 2020-12-29T06:45:54.33778039Z 2020-12-29 14:45:54,337 DEBUG
> org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] -
> Confirm leader session ID fbbaa883-69f6-43df-9ca0-c646bc1baad1 for leader
> akka.tcp://flink@mta-flink-jobmanager:6123/user/rpc/dispatcher_1.
> 2020-12-29T06:45:54.36249763Z 2020-12-29 14:45:54,362 DEBUG
> org.apache.flink.kubernetes.highavailability.KubernetesLeaderElectionDriver
> [] - Successfully wrote leader information:
> Leader=akka.tcp://flink@mta-flink-jobmanager:6123/user/rpc/dispatcher_1,
> session ID=fbbaa883-69f6-43df-9ca0-c646bc1baad1.
> 2020-12-29T06:46:04.298366262Z 2020-12-29 14:46:04,297 DEBUG
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] -
> Trigger heartbeat request.
> 2020-12-29T06:46:04.298442695Z 2020-12-29 14:46:04,298 DEBUG
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] -
> Trigger heartbeat request.
> 2020-12-29T06:46:14.318174464Z 2020-12-29 14:46:14,317 DEBUG
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] -
> Trigger heartbeat request.
> 2020-12-29T06:46:14.318256849Z 2020-12-29 14:46:14,318 DEBUG
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] -
> Trigger heartbeat request.
> 2020-12-29T06:46:24.337694477Z 2020-12-29 14:46:24,337 DEBUG
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] -
> Trigger heartbeat request.
> 2020-12-29T06:46:24.337816516Z 2020-12-29 14:46:24,337 DEBUG
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] -
> Trigger heartbeat request.
> 2020-12-29T06:46:26.044624193Z 2020-12-29 14:46:26,044 DEBUG
> org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBuf [] -
> -Dorg.apache.flink.shaded.netty4.io.netty.buffer.checkAccessible: true
> {code}