[
https://issues.apache.org/jira/browse/FLINK-25837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chesnay Schepler closed FLINK-25837.
------------------------------------
Resolution: Duplicate
> RPC Serialization failure when multiple job managers
> ----------------------------------------------------
>
> Key: FLINK-25837
> URL: https://issues.apache.org/jira/browse/FLINK-25837
> Project: Flink
> Issue Type: Bug
> Affects Versions: 1.14.3
> Reporter: Matthew McMahon
> Priority: Major
>
> I'm using 1.14.3 with the Lyft FlinkK8sOperator to run the cluster.
> Previously with 1.10.1 I had no problems, but now it seems when I have
> multiple Job Managers deployed, I am continually seeing this error, that
> prevents the Jobs from starting.
> {code}
> org.apache.flink.runtime.rpc.akka.exceptions.AkkaRpcException: Failed to
> serialize the result for RPC call : requestMultipleJobDetails.
> at
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.serializeRemoteResultAndVerifySize(AkkaRpcActor.java:417)
> at
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$sendAsyncResponse$2(AkkaRpcActor.java:373)
> at
> java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:836)
> at
> java.util.concurrent.CompletableFuture.uniHandleStage(CompletableFuture.java:848)
> at
> java.util.concurrent.CompletableFuture.handle(CompletableFuture.java:2168)
> at
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.sendAsyncResponse(AkkaRpcActor.java:365)
> at
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:332)
> at
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:217)
> at
> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:78)
> at
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:163)
> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24)
> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20)
> at scala.PartialFunction.applyOrElse(PartialFunction.scala:123)
> at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122)
> at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20)
> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
> at akka.actor.Actor.aroundReceive(Actor.scala:537)
> at akka.actor.Actor.aroundReceive$(Actor.scala:535)
> at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220)
> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580)
> at akka.actor.ActorCell.invoke(ActorCell.scala:548)
> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)
> at akka.dispatch.Mailbox.run(Mailbox.scala:231)
> at akka.dispatch.Mailbox.exec(Mailbox.scala:243)
> at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> at
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
> at
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
> Caused by: java.io.NotSerializableException: java.util.HashMap$Values
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
> at
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
> at
> java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
> at
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
> at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
> at
> org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:632)
> at
> org.apache.flink.runtime.rpc.akka.AkkaRpcSerializedValue.valueOf(AkkaRpcSerializedValue.java:66)
> at
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.serializeRemoteResultAndVerifySize(AkkaRpcActor.java:400)
> ... 29 more
> {code}
> I think it might be caused from
> org.apache.flink.runtime.dispatcher.Dispatcher.java
> {code}
> return combinedJobDetails.thenApply(
> (Collection<JobDetails> runningJobDetails) -> {
> final Map<JobID, JobDetails> deduplicatedJobs = new
> HashMap<>();
> completedJobDetails.forEach(job ->
> deduplicatedJobs.put(job.getJobId(), job));
> runningJobDetails.forEach(job ->
> deduplicatedJobs.put(job.getJobId(), job));
> return new MultipleJobsDetails(deduplicatedJobs.values());
> });
> {code}
> The MultipleJobsDetails result is a Collection, and apparently the HashMap
> Values iterator is not Serializable.
> If I have only a single Job Manager, then it works. That is what I am doing
> now.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)