Hi Amit,

AFAIK these exceptions are normal in HA mode: both JobManager instances periodically try to acquire or renew the same leader ConfigMap, so the instance that loses the race gets a 409 (AlreadyExists / Conflict) back from the Kubernetes API server and logs it. As long as a leader is eventually elected, they should be safe to ignore.
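For context, the standby JM's retry behaviour is driven by Flink's Kubernetes leader-election settings. A minimal sketch of the relevant flink-conf.yaml entries (option names from the Flink 1.13 docs; the cluster id and storage dir below are placeholders, and the timing values are the documented defaults as far as I can tell):

```yaml
# Sketch: Kubernetes HA with standby JobManagers (Flink 1.13).
kubernetes.cluster-id: my-flink-cluster   # placeholder
high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
high-availability.storageDir: s3://my-bucket/flink-ha   # placeholder; must be durable shared storage

# Leader-election timing. The standby retries the ConfigMap lock roughly
# every retry-period, which is when these 409 errors get logged.
high-availability.kubernetes.leader-election.lease-duration: 15s
high-availability.kubernetes.leader-election.renew-deadline: 15s
high-availability.kubernetes.leader-election.retry-period: 5s
```

Raising the retry period would only make the log lines less frequent; it does not change the (expected) conflict itself.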
Regards,
Roman

On Mon, Oct 25, 2021 at 1:45 PM Amit Bhatia <bhatia.amit1...@gmail.com> wrote:
>
> Hi,
>
> We have deployed two jobmanagers in HA mode on kubernetes using k8s configmap solution with deployment controller. During Installation and after restart we are getting below errors in standby jobmanager.
>
> 2021-10-25 11:17:46,397 ERROR io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] POD_NAME: eric-bss-em-sm-haflink-jobmanager-586d44dbbb-9v499 - Exception occurred while acquiring lock 'ConfigMapLock: gautam - eric-bss-em-sm-haflink-resourcemanager-leader (ebfdc2b3-1097-41fc-a377-b1d0a7916690)'
> io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.LockException: Unable to create ConfigMapLock
>     at io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.ConfigMapLock.create(ConfigMapLock.java:88) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.tryAcquireOrRenew(LeaderElector.java:138) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.lambda$acquire$0(LeaderElector.java:82) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.lambda$loop$3(LeaderElector.java:198) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_281]
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_281]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_281]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_281]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_281]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_281]
>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_281]
> Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://10.96.0.1/api/v1/namespaces/gautam/configmaps. Message: configmaps "eric-bss-em-sm-haflink-resourcemanager-leader" already exists. Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=null, kind=configmaps, name=eric-bss-em-sm-haflink-resourcemanager-leader, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=configmaps "eric-bss-em-sm-haflink-resourcemanager-leader" already exists, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=AlreadyExists, status=Failure, additionalProperties={}).
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:568) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:507) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:471) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:430) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:251) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:815) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:333) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.BaseOperation.lambda$createNew$0(BaseOperation.java:346) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.api.model.DoneableConfigMap.done(DoneableConfigMap.java:26) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.ConfigMapLock.create(ConfigMapLock.java:86) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     ... 10 more
>
>
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/opt/flink/lib/log4j-slf4j-impl-2.12.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/opt/flink/lib/log4j-slf4j-impl-2.13.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> 2021-09-17 09:24:45,742 ERROR io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] POD_NAME: eric-bss-em-sm-smdharam2-jobmanager-6f785d68b7-gkkjb - Exception occurred while acquiring lock 'ConfigMapLock: r12d-mediation - smdharam2-dispatcher-leader (b14658aa-2f69-4060-83c4-eb2b03d8edf5)'
> io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.LockException: Unable to update ConfigMapLock
>     at io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.ConfigMapLock.update(ConfigMapLock.java:108) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.tryAcquireOrRenew(LeaderElector.java:156) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.lambda$acquire$0(LeaderElector.java:82) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.lambda$loop$3(LeaderElector.java:198) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_281]
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_281]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_281]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_281]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_281]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_281]
>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_281]
> Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PUT at: https://10.96.0.1/api/v1/namespaces/r12d-mediation/configmaps/smdharam2-dispatcher-leader. Message: Operation cannot be fulfilled on configmaps "smdharam2-dispatcher-leader": the object has been modified; please apply your changes to the latest version and try again. Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=null, kind=configmaps, name=smdharam2-dispatcher-leader, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Operation cannot be fulfilled on configmaps "smdharam2-dispatcher-leader": the object has been modified; please apply your changes to the latest version and try again, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Conflict, status=Failure, additionalProperties={}).
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:568) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:507) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:471) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:430) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleReplace(OperationSupport.java:289) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleReplace(OperationSupport.java:269) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleReplace(BaseOperation.java:820) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.lambda$replace$1(HasMetadataOperation.java:86) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.api.model.DoneableConfigMap.done(DoneableConfigMap.java:26) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.api.model.DoneableConfigMap.done(DoneableConfigMap.java:5) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.replace(HasMetadataOperation.java:92) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.replace(HasMetadataOperation.java:36) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.ConfigMapLock.update(ConfigMapLock.java:106) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     ... 10 more
> 2021-09-17 09:24:45,742 ERROR io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] POD_NAME: eric-bss-em-sm-smdharam2-jobmanager-6f785d68b7-gkkjb - Exception occurred while acquiring lock 'ConfigMapLock: r12d-mediation - smdharam2-restserver-leader (b14658aa-2f69-4060-83c4-eb2b03d8edf5)'
> io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.LockException: Unable to update ConfigMapLock
>     at io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.ConfigMapLock.update(ConfigMapLock.java:108) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.tryAcquireOrRenew(LeaderElector.java:156) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.lambda$acquire$0(LeaderElector.java:82) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.lambda$loop$3(LeaderElector.java:198) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_281]
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_281]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_281]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_281]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_281]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_281]
>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_281]
> Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PUT at: https://10.96.0.1/api/v1/namespaces/r12d-mediation/configmaps/smdharam2-restserver-leader. Message: Operation cannot be fulfilled on configmaps "smdharam2-restserver-leader": the object has been modified; please apply your changes to the latest version and try again. Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=null, kind=configmaps, name=smdharam2-restserver-leader, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Operation cannot be fulfilled on configmaps "smdharam2-restserver-leader": the object has been modified; please apply your changes to the latest version and try again, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Conflict, status=Failure, additionalProperties={}).
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:568) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:507) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:471) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:430) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleReplace(OperationSupport.java:289) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleReplace(OperationSupport.java:269) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleReplace(BaseOperation.java:820) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.lambda$replace$1(HasMetadataOperation.java:86) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.api.model.DoneableConfigMap.done(DoneableConfigMap.java:26) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.api.model.DoneableConfigMap.done(DoneableConfigMap.java:5) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.replace(HasMetadataOperation.java:92) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.replace(HasMetadataOperation.java:36) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     at io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.ConfigMapLock.update(ConfigMapLock.java:106) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>     ... 10 more
>
> So just wanted to confirm if this is safe to ignore these errors or do we need to make some changes in configuration ?
>
> Regards,
>
> Amit