[ 
https://issues.apache.org/jira/browse/BEAM-13164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17438233#comment-17438233
 ] 

Luke Cwik commented on BEAM-13164:
----------------------------------

It looks like the SDK harness is connecting to the Spark runner on an endpoint 
that the runner does not recognize. This implies that the server is unaware of 
the endpoint that it told the SDK harness to connect to.


{noformat}
21/11/03 11:11:10 INFO 
org.apache.beam.runners.fnexecution.logging.GrpcLoggingService: 1 Beam Fn 
Logging clients still connected during shutdown.
21/11/03 11:11:10 WARN org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer: 
Hanged up for unknown endpoint.
21/11/03 11:11:10 ERROR org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer2: 
Failed to handle for url: "InProcessServer_328"

org.apache.beam.vendor.grpc.v1p36p0.io.grpc.StatusRuntimeException: CANCELLED: 
Multiplexer hanging up
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.Status.asRuntimeException(Status.java:535)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:478)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:553)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:68)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:739)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:718)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Shutting SDK harness down.
21/11/03 11:11:21 WARN org.apache.spark.executor.Executor: Issue communicating 
with driver in heartbeater
org.apache.spark.SparkException: Exception thrown in awaitResult: 
        at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:301)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
        at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:103)
        at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:87)
        at 
org.apache.spark.storage.BlockManagerMaster.registerBlockManager(BlockManagerMaster.scala:78)
        at 
org.apache.spark.storage.BlockManager.reregister(BlockManager.scala:589)
        at 
org.apache.spark.executor.Executor.reportHeartBeat(Executor.scala:1000)
        at 
org.apache.spark.executor.Executor.$anonfun$heartbeater$1(Executor.scala:212)
        at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1996)
        at org.apache.spark.Heartbeater$$anon$1.run(Heartbeater.scala:46)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:296)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
        at 
org.apache.spark.storage.BlockManagerMasterEndpoint.org$apache$spark$storage$BlockManagerMasterEndpoint$$register(BlockManagerMasterEndpoint.scala:524)
        at 
org.apache.spark.storage.BlockManagerMasterEndpoint$$anonfun$receiveAndReply$1.applyOrElse(BlockManagerMasterEndpoint.scala:116)
        at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:103)
        at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213)
        at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
        at 
org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75)
        at 
org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        ... 3 more
21/11/03 11:11:21 ERROR org.apache.spark.rpc.netty.Inbox: Ignoring error
java.lang.NullPointerException
        at 
org.apache.spark.storage.BlockManagerMasterEndpoint.org$apache$spark$storage$BlockManagerMasterEndpoint$$register(BlockManagerMasterEndpoint.scala:524)
        at 
org.apache.spark.storage.BlockManagerMasterEndpoint$$anonfun$receiveAndReply$1.applyOrElse(BlockManagerMasterEndpoint.scala:116)
        at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:103)
        at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213)
        at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
        at 
org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75)
        at 
org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
21/11/03 11:11:26 INFO org.apache.beam.runners.spark.SparkPipelineRunner: 
Running job 
combinetest0windowingtests0testslidingwindowscombine-lcwik-1103181102-2f9939bb_0c5eafd2-ea83-49df-8b03-6d7b8bbb4328
 on Spark master local[4]
21/11/03 11:11:26 WARN 
org.apache.beam.runners.spark.translation.GroupNonMergingWindowsFunctions: 
Either coder LengthPrefixCoder(ByteArrayCoder) or 
IntervalWindow$IntervalWindowCoder is not consistent with equals. That might 
cause issues on some runners.
21/11/03 11:11:26 WARN 
org.apache.beam.runners.spark.translation.GroupNonMergingWindowsFunctions: 
Either coder LengthPrefixCoder(ByteArrayCoder) or GlobalWindow$Coder is not 
consistent with equals. That might cause issues on some runners.
21/11/03 11:11:26 INFO org.apache.beam.runners.spark.SparkPipelineRunner: Job 
combinetest0windowingtests0testslidingwindowscombine-lcwik-1103181102-2f9939bb_0c5eafd2-ea83-49df-8b03-6d7b8bbb4328:
 Pipeline translated successfully. Computing outputs
21/11/03 11:11:27 INFO org.apache.beam.fn.harness.FnHarness: Fn Harness started
21/11/03 11:11:27 INFO 
org.apache.beam.runners.fnexecution.logging.GrpcLoggingService: Beam Fn Logging 
client connected.
21/11/03 11:11:27 INFO org.apache.beam.fn.harness.FnHarness: Entering 
instruction processing loop
21/11/03 11:11:27 INFO 
org.apache.beam.runners.fnexecution.control.FnApiControlClientPoolService: Beam 
Fn Control client connected with id 56-1
21/11/03 11:11:27 INFO 
org.apache.beam.runners.fnexecution.control.FnApiControlClientPoolService: 
getProcessBundleDescriptor request with id 56-2
21/11/03 11:11:27 INFO 
org.apache.beam.runners.fnexecution.data.GrpcDataService: Beam Fn Data client 
connected.
21/11/03 11:11:27 ERROR org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer: 
Failed to handle for unknown endpoint
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.StatusRuntimeException: CANCELLED: 
client cancelled
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.Status.asRuntimeException(Status.java:526)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onCancel(ServerCalls.java:284)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.PartialForwardingServerCallListener.onCancel(PartialForwardingServerCallListener.java:40)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.ForwardingServerCallListener.onCancel(ForwardingServerCallListener.java:23)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.ForwardingServerCallListener$SimpleForwardingServerCallListener.onCancel(ForwardingServerCallListener.java:40)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.Contexts$ContextualizedServerCallListener.onCancel(Contexts.java:96)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.closedInternal(ServerCallImpl.java:353)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.closed(ServerCallImpl.java:341)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1Closed.runInContext(ServerImpl.java:844)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
21/11/03 11:11:27 ERROR org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer2: 
Failed to handle for url: "InProcessServer_334"

org.apache.beam.vendor.grpc.v1p36p0.io.grpc.StatusRuntimeException: CANCELLED: 
Failed to read message.
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.Status.asRuntimeException(Status.java:535)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:478)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:553)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:68)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:739)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:718)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
        at 
org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer2$InboundObserver.forwardToConsumerForInstructionId(BeamFnDataGrpcMultiplexer2.java:213)
        at 
org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer2$InboundObserver.onNext(BeamFnDataGrpcMultiplexer2.java:184)
        at 
org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer2$InboundObserver.onNext(BeamFnDataGrpcMultiplexer2.java:157)
        at 
org.apache.beam.sdk.fn.stream.ForwardingClientResponseObserver.onNext(ForwardingClientResponseObserver.java:49)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:465)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInternal(ClientCallImpl.java:652)
        at 
org.apache.beam.vendor.grpc.v1p36p0.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:637)
        ... 5 more

{noformat}
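
Since the INFO, WARN and ERROR records in the log above are interleaved with long stack traces, it can help to reduce it to an ordered list of just the WARN/ERROR records when checking which component complained first. The following is a minimal sketch in Python, not part of Beam or Spark. The function names `parse_records` and `summarize_problems` are hypothetical, and it assumes every record starts with a `yy/MM/dd HH:mm:ss LEVEL` prefix as in the excerpt, with wrapped text and stack frames belonging to the preceding record.

```python
import re

# Assumption: a record starts with "yy/MM/dd HH:mm:ss LEVEL", as in the excerpt.
# Any following line without that prefix (wrapped text, stack frames) is
# treated as a continuation of the same record.
RECORD_START = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (INFO|WARN|ERROR)\b\s*(.*)$"
)


def parse_records(lines):
    """Group raw log lines into records with time, level and joined text."""
    records = []
    for raw in lines:
        line = raw.strip()
        match = RECORD_START.match(line)
        if match:
            first = match.group(3).strip()
            records.append({
                "time": match.group(1),
                "level": match.group(2),
                "parts": [first] if first else [],
            })
        elif records and line:
            # Continuation line: attach it to the most recent record.
            records[-1]["parts"].append(line)
    for record in records:
        record["text"] = " ".join(record.pop("parts"))
    return records


def summarize_problems(lines):
    """Return (time, level, logger) for each WARN/ERROR record, in log order."""
    summary = []
    for record in parse_records(lines):
        if record["level"] in ("WARN", "ERROR"):
            # The logger name is everything before the first colon.
            logger = record["text"].split(":", 1)[0].strip()
            summary.append((record["time"], record["level"], logger))
    return summary
```

Calling `summarize_problems(open(path).read().splitlines())` on a saved copy of the log (where `path` is wherever the log was saved) gives one tuple per WARN/ERROR record, which makes the order of the multiplexer and BlockManager failures easier to compare.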


> beam_PostCommit_Java_PVR_Spark_Batch timing out
> -----------------------------------------------
>
>                 Key: BEAM-13164
>                 URL: https://issues.apache.org/jira/browse/BEAM-13164
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark
>            Reporter: Andrew Pilloud
>            Assignee: Luke Cwik
>            Priority: P1
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Looks like this went from being a flake to a hard failure: 
> https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/
> https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/5009/
> 18:41:18 Build timed out (after 100 minutes). Marking the build as aborted.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
