Exceptions like these generally mean that components within the Spark
cluster can't communicate with each other. The most common cause is a
failure of one of the processes that are supposed to be communicating. I
generally see this when one of the processes goes into a GC storm or is shut
down because of an exception.
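
One way to see which calls are timing out, and how often, is to scan the
driver and executor logs for the RpcTimeoutException messages and tally them
by the name reported in the message. The following is only a minimal sketch:
it assumes plain-text logs with unwrapped lines in the same message format as
the traces quoted below, and the function name and sample log are
hypothetical, made up for illustration.

```python
import re
from collections import Counter

# Message format as it appears in the traces in this thread, e.g.
# "Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.askTimeout"
TIMEOUT_RE = re.compile(
    r"Futures timed out after \[(\d+) seconds\]\. "
    r"This timeout is controlled by ([\w.]+)"
)


def summarize_rpc_timeouts(log_text):
    """Count RPC timeout messages, keyed by (controlling name, seconds).

    Hypothetical helper for illustration; not part of Spark.
    """
    counts = Counter()
    for seconds, name in TIMEOUT_RE.findall(log_text):
        # Drop a trailing period in case the name ends a sentence.
        counts[(name.rstrip("."), int(seconds))] += 1
    return counts


# Sample input built from the messages in this thread (lines unwrapped).
sample_log = "\n".join([
    "org.apache.spark.SparkException: Job aborted due to stage failure: "
    "Task serialization failed: org.apache.spark.rpc.RpcTimeoutException: "
    "Futures timed out after [120 seconds]. "
    "This timeout is controlled by spark.rpc.askTimeout",
    "org.apache.spark.rpc.RpcTimeoutException: "
    "Futures timed out after [120 seconds]. "
    "This timeout is controlled by spark.rpc.askTimeout",
    "org.apache.spark.rpc.RpcTimeoutException: "
    "Futures timed out after [600 seconds]. "
    "This timeout is controlled by BlockManagerHeartbeat",
])

for (name, seconds), n in sorted(summarize_rpc_timeouts(sample_log).items()):
    print(f"{name} ({seconds}s): {n} occurrence(s)")
```

Comparing the timestamps of these messages with the GC activity and
shutdown messages in the logs of the process on the other end of the call
can help show whether that process was paused or already gone.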

On Fri, Nov 20, 2020 at 10:52 AM Amit Sharma <resolve...@gmail.com> wrote:

> Russell, I increased the RPC timeout to 240 seconds, but I am still getting
> this issue once in a while. After it occurs, my Spark Streaming job gets
> stuck and does not process any requests, so I need to restart it every
> time. Any suggestions, please?
>
>
> Thanks
> Amit
>
> On Wed, Nov 18, 2020 at 12:05 PM Amit Sharma <resolve...@gmail.com> wrote:
>
>> Hi, we are running a Spark Streaming job, and sometimes it throws the two
>> exceptions below. I do not understand the difference between these two
>> exceptions: for one the timeout is 120 seconds, and for the other it is
>> 600 seconds. What could be the reason for these?
>>
>>
>>  Error running job streaming job 1605709968000 ms.0
>> org.apache.spark.SparkException: Job aborted due to stage failure: Task serialization failed: org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.askTimeout
>> org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.askTimeout
>>         at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:47)
>>         at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:62)
>>         at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:58)
>>         at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
>>         at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
>>         at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:92)
>>         at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:76)
>>         at org.apache.spark.storage.BlockManagerMaster.updateBlockInfo(BlockManagerMaster.scala:76)
>>         at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$tryToReportBlockStatus(BlockManager.scala:466)
>>         at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$reportBlockStatus(BlockManager.scala:445)
>>         at org.apache.spark.storage.BlockManager.removeBlockInternal(BlockManager.scala:1519)
>>         at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1047)
>>
>> 2020-11-18 14:44:03 ERROR Utils:91 - Uncaught exception in thread heartbeat-receiver-event-loop-thread
>> org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [600 seconds]. This timeout is controlled by BlockManagerHeartbeat
>>         at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:47)
>>         at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:62)
>>         at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:58)
>>         at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
>>         at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
>>         at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:92)
>>         at org.apache.spark.scheduler.DAGScheduler.executorHeartbeatReceived(DAGScheduler.scala:251)
>>         at org.apache.spark.scheduler.TaskSchedulerImpl.executorHeartbeatReceived(TaskSchedulerImpl.scala:455)
>>         at org.apache.spark.HeartbeatReceiver$$anonfun$receiveAndReply$1$$anon$2$$anonfun$run$2.apply$mcV$sp(HeartbeatReceiver.scala:129)
>>         at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1361)
>>         at org.apache.spark.HeartbeatReceiver$$anonfun$receiveAndReply$1$$anon$2.run(HeartbeatReceiver.scala:128)
>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>         at java.lang.Thread.run(Thread.java:748)
>>
>
