[ 
https://issues.apache.org/jira/browse/FLINK-25837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-25837.
------------------------------------
    Resolution: Duplicate

> RPC Serialization failure when multiple job managers
> ----------------------------------------------------
>
>                 Key: FLINK-25837
>                 URL: https://issues.apache.org/jira/browse/FLINK-25837
>             Project: Flink
>          Issue Type: Bug
>    Affects Versions: 1.14.3
>            Reporter: Matthew McMahon
>            Priority: Major
>
> I'm using 1.14.3 with the Lyft FlinkK8sOperator to run the cluster.
> Previously with 1.10.1 I had no problems, but now it seems when I have 
> multiple Job Managers deployed, I am continually seeing this error, that 
> prevents the Jobs from starting.
> {code}
> org.apache.flink.runtime.rpc.akka.exceptions.AkkaRpcException: Failed to 
> serialize the result for RPC call : requestMultipleJobDetails.
>       at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.serializeRemoteResultAndVerifySize(AkkaRpcActor.java:417)
>       at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$sendAsyncResponse$2(AkkaRpcActor.java:373)
>       at 
> java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:836)
>       at 
> java.util.concurrent.CompletableFuture.uniHandleStage(CompletableFuture.java:848)
>       at 
> java.util.concurrent.CompletableFuture.handle(CompletableFuture.java:2168)
>       at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.sendAsyncResponse(AkkaRpcActor.java:365)
>       at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:332)
>       at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:217)
>       at 
> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:78)
>       at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:163)
>       at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24)
>       at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20)
>       at scala.PartialFunction.applyOrElse(PartialFunction.scala:123)
>       at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122)
>       at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20)
>       at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>       at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
>       at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
>       at akka.actor.Actor.aroundReceive(Actor.scala:537)
>       at akka.actor.Actor.aroundReceive$(Actor.scala:535)
>       at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220)
>       at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580)
>       at akka.actor.ActorCell.invoke(ActorCell.scala:548)
>       at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)
>       at akka.dispatch.Mailbox.run(Mailbox.scala:231)
>       at akka.dispatch.Mailbox.exec(Mailbox.scala:243)
>       at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>       at 
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>       at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
>       at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
> Caused by: java.io.NotSerializableException: java.util.HashMap$Values
>       at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
>       at 
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>       at 
> java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
>       at 
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
>       at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>       at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
>       at 
> org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:632)
>       at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcSerializedValue.valueOf(AkkaRpcSerializedValue.java:66)
>       at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.serializeRemoteResultAndVerifySize(AkkaRpcActor.java:400)
>       ... 29 more
> {code}
> I think it might be caused from 
> org.apache.flink.runtime.dispatcher.Dispatcher.java
> {code}
> return combinedJobDetails.thenApply(
>                 (Collection<JobDetails> runningJobDetails) -> {
>                     final Map<JobID, JobDetails> deduplicatedJobs = new 
> HashMap<>();
>                     completedJobDetails.forEach(job -> 
> deduplicatedJobs.put(job.getJobId(), job));
>                     runningJobDetails.forEach(job -> 
> deduplicatedJobs.put(job.getJobId(), job));
>                     return new MultipleJobsDetails(deduplicatedJobs.values());
>                 });
> {code}
> The MultipleJobsDetails result is a Collection, and apparently the HashMap 
> Values iterator is not Serializable.
> If I have only a single Job Manager, then it works. That is what I am doing 
> now.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to