The general exceptions here mean that components within the Spark cluster can't communicate. The most common cause is a failure of one of the processes that are supposed to be communicating. I generally see this when one of those processes goes into a GC storm or is shut down because of an exception.
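For reference, the timeout and GC-logging knobs mentioned in this thread can all be set at submit time. A sketch of a spark-submit invocation (the values here are illustrative, not recommendations; tune them for your cluster):

```shell
# Raise the RPC ask timeout (the 120s one from the first exception) and the
# general network timeout, and turn on GC logging on the executors so a GC
# storm shows up in the executor stdout/stderr logs.
spark-submit \
  --conf spark.rpc.askTimeout=240s \
  --conf spark.network.timeout=600s \
  --conf spark.executor.extraJavaOptions="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
  --class com.example.MyStreamingJob \
  my-streaming-job.jar
```

Note that raising timeouts only hides the symptom: if an executor is in a GC storm, a longer timeout just means the job hangs for longer before the error surfaces. Checking the GC logs (or the executor memory metrics in the Spark UI) is usually more productive than tuning the timeouts.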
On Fri, Nov 20, 2020 at 10:52 AM Amit Sharma <resolve...@gmail.com> wrote:

> Russell, I increased the RPC timeout to 240 seconds, but I am still getting
> this issue once in a while, and after it occurs my Spark streaming job gets
> stuck and does not process any requests, so I need to restart it every time.
> Any suggestions, please?
>
> Thanks
> Amit
>
> On Wed, Nov 18, 2020 at 12:05 PM Amit Sharma <resolve...@gmail.com> wrote:
>
>> Hi, we are running a Spark streaming job and it sometimes throws the two
>> exceptions below. I do not understand the difference between these two
>> exceptions: one timeout is 120 seconds and the other is 600 seconds.
>> What could be the reason for these?
>>
>> Error running job streaming job 1605709968000 ms.0
>> org.apache.spark.SparkException: Job aborted due to stage failure: Task
>> serialization failed: org.apache.spark.rpc.RpcTimeoutException: Futures
>> timed out after [120 seconds]. This timeout is controlled by
>> spark.rpc.askTimeout
>> org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds].
>> This timeout is controlled by spark.rpc.askTimeout
>>     at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:47)
>>     at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:62)
>>     at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:58)
>>     at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
>>     at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
>>     at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:92)
>>     at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:76)
>>     at org.apache.spark.storage.BlockManagerMaster.updateBlockInfo(BlockManagerMaster.scala:76)
>>     at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$tryToReportBlockStatus(BlockManager.scala:466)
>>     at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$reportBlockStatus(BlockManager.scala:445)
>>     at org.apache.spark.storage.BlockManager.removeBlockInternal(BlockManager.scala:1519)
>>     at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1047)
>>
>> 2020-11-18 14:44:03 ERROR Utils:91 - Uncaught exception in thread heartbeat-receiver-event-loop-thread
>> org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [600 seconds].
>> This timeout is controlled by BlockManagerHeartbeat
>>     at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:47)
>>     at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:62)
>>     at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:58)
>>     at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
>>     at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
>>     at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:92)
>>     at org.apache.spark.scheduler.DAGScheduler.executorHeartbeatReceived(DAGScheduler.scala:251)
>>     at org.apache.spark.scheduler.TaskSchedulerImpl.executorHeartbeatReceived(TaskSchedulerImpl.scala:455)
>>     at org.apache.spark.HeartbeatReceiver$$anonfun$receiveAndReply$1$$anon$2$$anonfun$run$2.apply$mcV$sp(HeartbeatReceiver.scala:129)
>>     at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1361)
>>     at org.apache.spark.HeartbeatReceiver$$anonfun$receiveAndReply$1$$anon$2.run(HeartbeatReceiver.scala:128)
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>     at java.lang.Thread.run(Thread.java:748)