[
https://issues.apache.org/jira/browse/SPARK-14228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16239854#comment-16239854
]
Danula Eranjith commented on SPARK-14228:
-----------------------------------------
I encountered the same issue in 1.6.3
{code}
17/11/03 22:38:33 WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Executor for
container container_e02_1509517131757_0001_01_000003 exited because of a YARN
event (e.g., pre-emption) and not because of an error in the running job.
17/11/03 22:38:33 ERROR YarnClientSchedulerBackend: Could not find
CoarseGrainedScheduler or it has been stopped.
org.apache.spark.SparkException: Could not find CoarseGrainedScheduler or it
has been stopped.
at
org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:163)
at
org.apache.spark.rpc.netty.Dispatcher.postLocalMessage(Dispatcher.scala:128)
at org.apache.spark.rpc.netty.NettyRpcEnv.ask(NettyRpcEnv.scala:231)
at
org.apache.spark.rpc.netty.NettyRpcEndpointRef.ask(NettyRpcEnv.scala:515)
at org.apache.spark.rpc.RpcEndpointRef.ask(RpcEndpointRef.scala:62)
at
org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.removeExecutor(CoarseGrainedSchedulerBackend.scala:392)
at
org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnSchedulerEndpoint$$anonfun$receive$1.applyOrElse(YarnSchedulerBackend.scala:259)
at
org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:116)
at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:204)
at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
at
org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:217)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
17/11/03 22:38:33 WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Executor for
container container_e02_1509517131757_0001_01_000002 exited because of a YARN
event (e.g., pre-emption) and not because of an error in the running job.
17/11/03 22:38:33 ERROR YarnClientSchedulerBackend: Could not find
CoarseGrainedScheduler or it has been stopped.
org.apache.spark.SparkException: Could not find CoarseGrainedScheduler or it
has been stopped.
at
org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:163)
at
org.apache.spark.rpc.netty.Dispatcher.postLocalMessage(Dispatcher.scala:128)
at org.apache.spark.rpc.netty.NettyRpcEnv.ask(NettyRpcEnv.scala:231)
at
org.apache.spark.rpc.netty.NettyRpcEndpointRef.ask(NettyRpcEnv.scala:515)
at org.apache.spark.rpc.RpcEndpointRef.ask(RpcEndpointRef.scala:62)
at
org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.removeExecutor(CoarseGrainedSchedulerBackend.scala:392)
at
org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnSchedulerEndpoint$$anonfun$receive$1.applyOrElse(YarnSchedulerBackend.scala:259)
at
org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:116)
at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:204)
at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
at
org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:217)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}
> Lost executor of RPC disassociated, and occurs exception: Could not find
> CoarseGrainedScheduler or it has been stopped
> ----------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-14228
> URL: https://issues.apache.org/jira/browse/SPARK-14228
> Project: Spark
> Issue Type: Bug
> Reporter: meiyoula
>
> When I start 1000 executors, and then stop the process. It will call
> SparkContext.stop to stop all executors. But during this process, the
> executors has been killed will lost of rpc with driver, and try to
> reviveOffers, but can't find CoarseGrainedScheduler or it has been stopped.
> {quote}
> 16/03/29 01:45:45 ERROR YarnScheduler: Lost executor 610 on 51-196-152-8:
> remote Rpc client disassociated
> 16/03/29 01:45:45 ERROR Inbox: Ignoring error
> org.apache.spark.SparkException: Could not find CoarseGrainedScheduler or it
> has been stopped.
> at
> org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:161)
> at
> org.apache.spark.rpc.netty.Dispatcher.postOneWayMessage(Dispatcher.scala:131)
> at org.apache.spark.rpc.netty.NettyRpcEnv.send(NettyRpcEnv.scala:173)
> at
> org.apache.spark.rpc.netty.NettyRpcEndpointRef.send(NettyRpcEnv.scala:398)
> at
> org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.reviveOffers(CoarseGrainedSchedulerBackend.scala:314)
> at
> org.apache.spark.scheduler.TaskSchedulerImpl.executorLost(TaskSchedulerImpl.scala:482)
> at
> org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint.removeExecutor(CoarseGrainedSchedulerBackend.scala:261)
> at
> org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint$$anonfun$onDisconnected$1.apply(CoarseGrainedSchedulerBackend.scala:207)
> at
> org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint$$anonfun$onDisconnected$1.apply(CoarseGrainedSchedulerBackend.scala:207)
> at scala.Option.foreach(Option.scala:236)
> at
> org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint.onDisconnected(CoarseGrainedSchedulerBackend.scala:207)
> at
> org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:144)
> at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:204)
> at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:102)
> at
> org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:215)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {quote}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]