[ https://issues.apache.org/jira/browse/FLINK-27856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545240#comment-17545240 ]
Yang Wang commented on FLINK-27856:
-----------------------------------
The JobManager throws an NPE if the {{spec}} is not configured in the
pod template. This ticket needs to be fixed in the Flink project.
{code:java}
2022-06-02 02:26:11,864 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Fatal error occurred in the cluster entrypoint.
org.apache.flink.runtime.resourcemanager.exceptions.ResourceManagerException: Could not start the ResourceManager akka.tcp://[email protected]:6123/user/rpc/resourcemanager_1
    at org.apache.flink.runtime.resourcemanager.ResourceManager.onStart(ResourceManager.java:223) ~[flink-dist-1.15.0.jar:1.15.0]
    at org.apache.flink.runtime.rpc.RpcEndpoint.internalCallOnStart(RpcEndpoint.java:181) ~[flink-dist-1.15.0.jar:1.15.0]
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor$StoppedState.lambda$start$0(AkkaRpcActor.java:612) ~[flink-rpc-akka_ee3c2e1e-3cab-4d23-8b5d-cdf80a18084e.jar:1.15.0]
    at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) ~[flink-rpc-akka_ee3c2e1e-3cab-4d23-8b5d-cdf80a18084e.jar:1.15.0]
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor$StoppedState.start(AkkaRpcActor.java:611) ~[flink-rpc-akka_ee3c2e1e-3cab-4d23-8b5d-cdf80a18084e.jar:1.15.0]
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleControlMessage(AkkaRpcActor.java:185) ~[flink-rpc-akka_ee3c2e1e-3cab-4d23-8b5d-cdf80a18084e.jar:1.15.0]
    at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24) [flink-rpc-akka_ee3c2e1e-3cab-4d23-8b5d-cdf80a18084e.jar:1.15.0]
    at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20) [flink-rpc-akka_ee3c2e1e-3cab-4d23-8b5d-cdf80a18084e.jar:1.15.0]
    at scala.PartialFunction.applyOrElse(PartialFunction.scala:123) [flink-rpc-akka_ee3c2e1e-3cab-4d23-8b5d-cdf80a18084e.jar:1.15.0]
    at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122) [flink-rpc-akka_ee3c2e1e-3cab-4d23-8b5d-cdf80a18084e.jar:1.15.0]
    at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20) [flink-rpc-akka_ee3c2e1e-3cab-4d23-8b5d-cdf80a18084e.jar:1.15.0]
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) [flink-rpc-akka_ee3c2e1e-3cab-4d23-8b5d-cdf80a18084e.jar:1.15.0]
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) [flink-rpc-akka_ee3c2e1e-3cab-4d23-8b5d-cdf80a18084e.jar:1.15.0]
    at akka.actor.Actor.aroundReceive(Actor.scala:537) [flink-rpc-akka_ee3c2e1e-3cab-4d23-8b5d-cdf80a18084e.jar:1.15.0]
    at akka.actor.Actor.aroundReceive$(Actor.scala:535) [flink-rpc-akka_ee3c2e1e-3cab-4d23-8b5d-cdf80a18084e.jar:1.15.0]
    at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220) [flink-rpc-akka_ee3c2e1e-3cab-4d23-8b5d-cdf80a18084e.jar:1.15.0]
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580) [flink-rpc-akka_ee3c2e1e-3cab-4d23-8b5d-cdf80a18084e.jar:1.15.0]
    at akka.actor.ActorCell.invoke(ActorCell.scala:548) [flink-rpc-akka_ee3c2e1e-3cab-4d23-8b5d-cdf80a18084e.jar:1.15.0]
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270) [flink-rpc-akka_ee3c2e1e-3cab-4d23-8b5d-cdf80a18084e.jar:1.15.0]
    at akka.dispatch.Mailbox.run(Mailbox.scala:231) [flink-rpc-akka_ee3c2e1e-3cab-4d23-8b5d-cdf80a18084e.jar:1.15.0]
    at akka.dispatch.Mailbox.exec(Mailbox.scala:243) [flink-rpc-akka_ee3c2e1e-3cab-4d23-8b5d-cdf80a18084e.jar:1.15.0]
    at java.util.concurrent.ForkJoinTask.doExec(Unknown Source) [?:?]
    at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source) [?:?]
    at java.util.concurrent.ForkJoinPool.scan(Unknown Source) [?:?]
    at java.util.concurrent.ForkJoinPool.runWorker(Unknown Source) [?:?]
    at java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source) [?:?]
Caused by: org.apache.flink.runtime.resourcemanager.exceptions.ResourceManagerException: Cannot initialize resource provider.
    at org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager.initialize(ActiveResourceManager.java:158) ~[flink-dist-1.15.0.jar:1.15.0]
    at org.apache.flink.runtime.resourcemanager.ResourceManager.startResourceManagerServices(ResourceManager.java:241) ~[flink-dist-1.15.0.jar:1.15.0]
    at org.apache.flink.runtime.resourcemanager.ResourceManager.onStart(ResourceManager.java:218) ~[flink-dist-1.15.0.jar:1.15.0]
    ... 25 more
Caused by: java.lang.NullPointerException
    at org.apache.flink.kubernetes.utils.KubernetesUtils.loadPodFromTemplateFile(KubernetesUtils.java:421) ~[flink-dist-1.15.0.jar:1.15.0]
    at org.apache.flink.kubernetes.KubernetesResourceManagerDriver.initializeInternal(KubernetesResourceManagerDriver.java:115) ~[flink-dist-1.15.0.jar:1.15.0]
    at org.apache.flink.runtime.resourcemanager.active.AbstractResourceManagerDriver.initialize(AbstractResourceManagerDriver.java:81) ~[flink-dist-1.15.0.jar:1.15.0]
    at org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager.initialize(ActiveResourceManager.java:156) ~[flink-dist-1.15.0.jar:1.15.0]
    at org.apache.flink.runtime.resourcemanager.ResourceManager.startResourceManagerServices(ResourceManager.java:241) ~[flink-dist-1.15.0.jar:1.15.0]
    at org.apache.flink.runtime.resourcemanager.ResourceManager.onStart(ResourceManager.java:218) ~[flink-dist-1.15.0.jar:1.15.0]
{code}
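For illustration only, here is a minimal sketch of the kind of null guard that would avoid this failure. It assumes the NPE comes from dereferencing a missing {{spec}} on the parsed pod; the trace alone does not confirm what line 421 does. The {{Pod}} and {{PodSpec}} classes and the {{withDefaultSpec}} and {{containerCount}} helpers are hypothetical stand-ins, not the Flink or Kubernetes client types.

```java
import java.util.ArrayList;
import java.util.List;

public class PodTemplateGuard {

    /** Hypothetical stand-in for a pod spec; only tracks container names. */
    static final class PodSpec {
        final List<String> containers = new ArrayList<>();
    }

    /** Hypothetical stand-in for a pod parsed from a template file. */
    static final class Pod {
        PodSpec spec; // null when the template has no "spec" stanza

        Pod(PodSpec spec) {
            this.spec = spec;
        }
    }

    /** Fills in an empty spec if the template did not provide one. */
    static Pod withDefaultSpec(Pod pod) {
        if (pod.spec == null) {
            pod.spec = new PodSpec();
        }
        return pod;
    }

    /** Mimics downstream code that dereferences the spec without checking it. */
    static int containerCount(Pod pod) {
        return pod.spec.containers.size();
    }

    public static void main(String[] args) {
        // A template that only carries metadata, as in the reported case.
        Pod metadataOnly = new Pod(null);
        try {
            containerCount(metadataOnly);
        } catch (NullPointerException e) {
            System.out.println("NPE without spec");
        }
        // After defaulting, the same call succeeds with zero containers.
        System.out.println(containerCount(withDefaultSpec(metadataOnly)));
    }
}
```

Defaulting to an empty spec mirrors the {{spec: {}}} workaround from the issue description; rejecting the template with a clear error message would be an equally valid choice.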
> Adding pod template without spec crashes job manager
> ----------------------------------------------------
>
> Key: FLINK-27856
> URL: https://issues.apache.org/jira/browse/FLINK-27856
> Project: Flink
> Issue Type: Bug
> Components: Deployment / Kubernetes, Kubernetes Operator
> Affects Versions: kubernetes-operator-0.1.0, kubernetes-operator-1.0.0
> Reporter: Jeesmon Jacob
> Priority: Minor
> Fix For: kubernetes-operator-1.1.0
>
>
> While trying to add a Pod annotation through the pod template in a FlinkDeployment,
> the taskmanager kept crashing.
> The pod template that I used:
> {code:java}
> taskManager:
>   podTemplate:
>     apiVersion: v1
>     kind: Pod
>     metadata:
>       annotations:
>         iam.amazonaws.com/role: fake-role-arn
> {code}
> It created the ConfigMap below and mounted it to the deployment:
> {code:java}
> apiVersion: v1
> data:
>   taskmanager-pod-template.yaml: |
>     ---
>     apiVersion: "v1"
>     kind: "Pod"
>     metadata:
>       annotations:
>         iam.amazonaws.com/role: "fake-role-arn"
> kind: ConfigMap
> {code}
> It looks like the missing "spec" stanza in the pod template caused the crash, and I
> couldn't find any documentation stating that "spec" is required in a pod template,
> even when only adding metadata annotations.
> Adding the following worked fine:
> {code:java}
> taskManager:
>   podTemplate:
>     apiVersion: v1
>     kind: Pod
>     metadata:
>       annotations:
>         iam.amazonaws.com/role: fake-role-arn
>     spec: {}
> {code}
> The corresponding ConfigMap:
> {code:java}
> apiVersion: v1
> data:
>   taskmanager-pod-template.yaml: |
>     ---
>     apiVersion: "v1"
>     kind: "Pod"
>     metadata:
>       annotations:
>         iam.amazonaws.com/role: "fake-role-arn"
>     spec:
>       containers: []
> {code}
--
This message was sent by Atlassian Jira
(v8.20.7#820007)