[jira] [Commented] (SPARK-20814) Mesos scheduler does not respect spark.executor.extraClassPath configuration

2017-07-03 Thread Laurent Hoss (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072244#comment-16072244
 ] 

Laurent Hoss commented on SPARK-20814:
--

> Hmm, this sucks, we should fix it for 2.2 !
+1 (for the whole Spark-on-Mesos community).
The patch also looks straightforward, with no risk of breaking anything 
unrelated to Mesos, so why didn't it make it into 2.2.0?
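
For anyone hit by this: the affected setting is normally passed at submit 
time. A minimal sketch (master URL, paths, class and jar names are 
placeholders):
{code}
# Sketch only -- master URL, jar path and class name are placeholders.
spark-submit \
  --master mesos://zk://zk1:2181/mesos \
  --conf spark.executor.extraClassPath=/opt/extra-libs/my-lib.jar \
  --class com.example.MyApp \
  my-app.jar
{code}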


> Mesos scheduler does not respect spark.executor.extraClassPath configuration
> 
>
> Key: SPARK-20814
> URL: https://issues.apache.org/jira/browse/SPARK-20814
> Project: Spark
>  Issue Type: Bug
>  Components: Mesos
>Affects Versions: 2.2.0
>Reporter: Gene Pang
>Assignee: Marcelo Vanzin
>Priority: Critical
> Fix For: 2.2.1
>
>
> When Spark executors are deployed on Mesos, the Mesos scheduler no longer 
> respects the "spark.executor.extraClassPath" configuration parameter.
> MesosCoarseGrainedSchedulerBackend used to use the environment variable 
> "SPARK_CLASSPATH" to add the value of "spark.executor.extraClassPath" to the 
> executor classpath. However, "SPARK_CLASSPATH" was deprecated, and was 
> removed in this commit 
> [https://github.com/apache/spark/commit/8f0490e22b4c7f1fdf381c70c5894d46b7f7e6fb#diff-387c5d0c916278495fc28420571adf9eL178].
> This effectively broke the ability for users to specify 
> "spark.executor.extraClassPath" for Spark executors deployed on Mesos.






[jira] [Commented] (SPARK-19606) Support constraints in spark-dispatcher

2017-05-31 Thread Laurent Hoss (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16031146#comment-16031146
 ] 

Laurent Hoss commented on SPARK-19606:
--

+1
hopefully this PR gets committer attention
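
For reference, this is the setting the dispatcher currently ignores. A 
minimal sketch of a cluster-mode submission (dispatcher host, constraint 
attributes, class and jar names are placeholders):
{code}
# Sketch only -- constraints are attribute:value pairs separated by ';'.
spark-submit \
  --master mesos://dispatcher-host:7077 \
  --deploy-mode cluster \
  --conf spark.mesos.constraints="rack:rack-1;os:centos7" \
  --class com.example.MyApp \
  hdfs:///apps/my-app.jar
{code}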

> Support constraints in spark-dispatcher
> ---
>
> Key: SPARK-19606
> URL: https://issues.apache.org/jira/browse/SPARK-19606
> Project: Spark
>  Issue Type: New Feature
>  Components: Mesos
>Affects Versions: 2.1.0
>Reporter: Philipp Hoffmann
>
> The `spark.mesos.constraints` configuration is ignored by the 
> spark-dispatcher. The constraints need to be passed in the Framework 
> information when registering with Mesos.






[jira] [Comment Edited] (SPARK-19606) Support constraints in spark-dispatcher

2017-05-31 Thread Laurent Hoss (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16031146#comment-16031146
 ] 

Laurent Hoss edited comment on SPARK-19606 at 5/31/17 1:28 PM:
---

+1
hopefully this PR gets committer attention


was (Author: laurentcoder):
+1
hopefully this PR gets comitter attention

> Support constraints in spark-dispatcher
> ---
>
> Key: SPARK-19606
> URL: https://issues.apache.org/jira/browse/SPARK-19606
> Project: Spark
>  Issue Type: New Feature
>  Components: Mesos
>Affects Versions: 2.1.0
>Reporter: Philipp Hoffmann
>
> The `spark.mesos.constraints` configuration is ignored by the 
> spark-dispatcher. The constraints need to be passed in the Framework 
> information when registering with Mesos.






[jira] [Commented] (SPARK-15142) Spark Mesos dispatcher becomes unusable when the Mesos master restarts

2017-05-31 Thread Laurent Hoss (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16031143#comment-16031143
 ] 

Laurent Hoss commented on SPARK-15142:
--

+1 to get this solved!
(PS: we are calling the dispatcher with requests generated from a *Livy* 
container.)

> Spark Mesos dispatcher becomes unusable when the Mesos master restarts
> --
>
> Key: SPARK-15142
> URL: https://issues.apache.org/jira/browse/SPARK-15142
> Project: Spark
>  Issue Type: Bug
>  Components: Deploy, Mesos
>Reporter: Devaraj K
>Priority: Minor
> Attachments: 
> spark-devaraj-org.apache.spark.deploy.mesos.MesosClusterDispatcher-1-stobdtserver5.out
>
>
> While Spark Mesos dispatcher running if the Mesos master gets restarted then 
> Spark Mesos dispatcher will keep running and queues up all the submitted 
> applications and will not launch them.






[jira] [Commented] (SPARK-2984) FileNotFoundException on _temporary directory

2017-02-21 Thread Laurent Hoss (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875962#comment-15875962
 ] 

Laurent Hoss commented on SPARK-2984:
-

Related issue (still open): SPARK-10109
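
Side note: the race described below stems from speculative execution, so one 
common mitigation (not a real fix) is to keep speculation disabled. A minimal 
sketch, with placeholder class and jar names:
{code}
# Sketch only: speculation is off by default; ensuring it stays off avoids
# the _temporary cleanup race described below, at the cost of losing
# speculative retries for straggler tasks.
spark-submit --conf spark.speculation=false --class com.example.MyApp my-app.jar
{code}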

> FileNotFoundException on _temporary directory
> -
>
> Key: SPARK-2984
> URL: https://issues.apache.org/jira/browse/SPARK-2984
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.1.0
>Reporter: Andrew Ash
>Assignee: Josh Rosen
>Priority: Critical
> Fix For: 1.3.0
>
>
> We've seen several stacktraces and threads on the user mailing list where 
> people are having issues with a {{FileNotFoundException}} stemming from an 
> HDFS path containing {{_temporary}}.
> I ([~aash]) think this may be related to {{spark.speculation}}.  I think the 
> error condition might manifest in this circumstance:
> 1) task T starts on a executor E1
> 2) it takes a long time, so task T' is started on another executor E2
> 3) T finishes in E1 so moves its data from {{_temporary}} to the final 
> destination and deletes the {{_temporary}} directory during cleanup
> 4) T' finishes in E2 and attempts to move its data from {{_temporary}}, but 
> those files no longer exist!  exception
> Some samples:
> {noformat}
> 14/08/11 08:05:08 ERROR JobScheduler: Error running job streaming job 
> 140774430 ms.0
> java.io.FileNotFoundException: File 
> hdfs://hadoopc/user/csong/output/human_bot/-140774430.out/_temporary/0/task_201408110805__m_07
>  does not exist.
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:654)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:712)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:708)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:708)
> at 
> org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:360)
> at 
> org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitJob(FileOutputCommitter.java:310)
> at 
> org.apache.hadoop.mapred.FileOutputCommitter.commitJob(FileOutputCommitter.java:136)
> at 
> org.apache.spark.SparkHadoopWriter.commitJob(SparkHadoopWriter.scala:126)
> at 
> org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:841)
> at 
> org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:724)
> at 
> org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:643)
> at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1068)
> at 
> org.apache.spark.streaming.dstream.DStream$$anonfun$8.apply(DStream.scala:773)
> at 
> org.apache.spark.streaming.dstream.DStream$$anonfun$8.apply(DStream.scala:771)
> at 
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:41)
> at 
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
> at 
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
> at scala.util.Try$.apply(Try.scala:161)
> at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32)
> at 
> org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:172)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> -- Chen Song at 
> http://apache-spark-user-list.1001560.n3.nabble.com/saveAsTextFiles-file-not-found-exception-td10686.html
> {noformat}
> I am running a Spark Streaming job that uses saveAsTextFiles to save results 
> into hdfs files. However, it has an exception after 20 batches
> result-140631234/_temporary/0/task_201407251119__m_03 does not 
> exist.
> {noformat}
> and
> {noformat}
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
>  No lease on /apps/data/vddil/real-time/checkpoint/temp: File does not exist. 
> Holder DFSClient_NONMAPREDUCE_327993456_13 does not have any open files.
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2946)
>   at 
> 

[jira] [Commented] (SPARK-10109) NPE when saving Parquet To HDFS

2017-02-21 Thread Laurent Hoss (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875960#comment-15875960
 ] 

Laurent Hoss commented on SPARK-10109:
--

Seems related to SPARK-2984 (though that issue is closed, its last comment 
proposed creating a new one).

> NPE when saving Parquet To HDFS
> ---
>
> Key: SPARK-10109
> URL: https://issues.apache.org/jira/browse/SPARK-10109
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.4.1
> Environment: spark-ec2, standalone cluster on Amazon
>Reporter: Virgil Palanciuc
>
> Very simple code, trying to save a dataframe
> I get this in the driver
> {quote}
> 15/08/19 11:21:41 INFO TaskSetManager: Lost task 9.2 in stage 217.0 (TID 
> 4748) on executor 172.xx.xx.xx: java.lang.NullPointerException (null) 
> and  (not for that task):
> 15/08/19 11:21:46 WARN TaskSetManager: Lost task 5.0 in stage 543.0 (TID 
> 5607, 172.yy.yy.yy): java.lang.NullPointerException
> at 
> parquet.hadoop.InternalParquetRecordWriter.flushRowGroupToStore(InternalParquetRecordWriter.java:146)
> at 
> parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:112)
> at 
> parquet.hadoop.ParquetRecordWriter.close(ParquetRecordWriter.java:73)
> at 
> org.apache.spark.sql.parquet.ParquetOutputWriter.close(newParquet.scala:88)
> at 
> org.apache.spark.sql.sources.DynamicPartitionWriterContainer$$anonfun$clearOutputWriters$1.apply(commands.scala:536)
> at 
> org.apache.spark.sql.sources.DynamicPartitionWriterContainer$$anonfun$clearOutputWriters$1.apply(commands.scala:536)
> at 
> scala.collection.mutable.HashMap$$anon$2$$anonfun$foreach$3.apply(HashMap.scala:107)
> at 
> scala.collection.mutable.HashMap$$anon$2$$anonfun$foreach$3.apply(HashMap.scala:107)
> at 
> scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
> at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
> at scala.collection.mutable.HashMap$$anon$2.foreach(HashMap.scala:107)
> at 
> org.apache.spark.sql.sources.DynamicPartitionWriterContainer.clearOutputWriters(commands.scala:536)
> at 
> org.apache.spark.sql.sources.DynamicPartitionWriterContainer.abortTask(commands.scala:552)
> at 
> org.apache.spark.sql.sources.InsertIntoHadoopFsRelation.org$apache$spark$sql$sources$InsertIntoHadoopFsRelation$$writeRows$2(commands.scala:269)
> at 
> org.apache.spark.sql.sources.InsertIntoHadoopFsRelation$$anonfun$insertWithDynamicPartitions$3.apply(commands.scala:229)
> at 
> org.apache.spark.sql.sources.InsertIntoHadoopFsRelation$$anonfun$insertWithDynamicPartitions$3.apply(commands.scala:229)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
> at org.apache.spark.scheduler.Task.run(Task.scala:70)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {quote}
> I get this in the executor log:
> {quote}
> 15/08/19 11:21:41 WARN DFSClient: DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
>  No lease on 
> /gglogs/2015-07-27/_temporary/_attempt_201508191119_0217_m_09_2/dpid=18432/pid=1109/part-r-9-46ac3a79-a95c-4d9c-a2f1-b3ee76f6a46c.snappy.parquet
>  File does not exist. Holder DFSClient_NONMAPREDUCE_1730998114_63 does not 
> have any open files.
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2396)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2387)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2183)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:481)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:297)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44080)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1695)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1691)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at 

[jira] [Commented] (SPARK-10643) Support HDFS application download in client mode spark submit

2016-10-07 Thread Laurent Hoss (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1581#comment-1581
 ] 

Laurent Hoss commented on SPARK-10643:
--

+1
This would be very useful when using Zeppelin (running in Docker) on a *Mesos* 
cluster.
Unfortunately the Zeppelin GUI does not (yet) support adding jars from HDFS, 
nor any kind of HTTP upload; it only loads jars from local directories (not 
practical inside Docker) or from a (custom) Maven repository (not ideal for 
quick dev iterations).

After learning that this should work in cluster mode, I tried to submit a 
Spark job in cluster mode from within Zeppelin, but it failed because it could 
not find the built-in zeppelin-spark interpreter jar (when the driver runs in 
the cluster).
Not sure yet whether that is actually a bug, as I would assume Spark takes 
care of transferring the jars provided via '--jars'.
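
For context, roughly what such a cluster-mode submission against the 
dispatcher looks like (a sketch; the dispatcher host is a placeholder, and the 
class and jar reuse the example from the description below):
{code}
# Sketch only -- in cluster mode the driver runs on the cluster and Mesos
# fetches the application jar, so an hdfs:// jar works here.
spark-submit \
  --master mesos://dispatcher-host:7077 \
  --deploy-mode cluster \
  --class com.example.spark.streaming.EventHandler \
  hdfs://hdfs/tmp/application.jar
{code}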


> Support HDFS application download in client mode spark submit
> -
>
> Key: SPARK-10643
> URL: https://issues.apache.org/jira/browse/SPARK-10643
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Submit
>Reporter: Alan Braithwaite
>Priority: Minor
>
> When using mesos with docker and marathon, it would be nice to be able to 
> make spark-submit deployable on marathon and have that download a jar from 
> HDFS instead of having to package the jar with the docker.
> {code}
> $ docker run -it docker.example.com/spark:latest 
> /usr/local/spark/bin/spark-submit  --class 
> com.example.spark.streaming.EventHandler hdfs://hdfs/tmp/application.jar 
> Warning: Skip remote jar hdfs://hdfs/tmp/application.jar.
> java.lang.ClassNotFoundException: com.example.spark.streaming.EventHandler
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:348)
> at org.apache.spark.util.Utils$.classForName(Utils.scala:173)
> at 
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:639)
> at 
> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> {code}
> Although I'm aware that we can run in cluster mode with mesos, we've already 
> built some nice tools surrounding marathon for logging and monitoring.
> Code in question:
> https://github.com/apache/spark/blob/132718ad7f387e1002b708b19e471d9cd907e105/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L723-L736






[jira] [Commented] (SPARK-12430) Temporary folders do not get deleted after Task completes causing problems with disk space.

2016-08-17 Thread Laurent Hoss (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15424242#comment-15424242
 ] 

Laurent Hoss commented on SPARK-12430:
--

In Spark 2.0 this should be less of an issue, at least when using 
coarse-grained mode, after SPARK-12330 was fixed.
[~fede-bis] any feedback on whether this has been resolved for you?

[~dragos] said:
> Are you using 1.6? In that case, the blockmgr directory should really be 
> inside your Mesos sandbox, not under /tmp. At least, that's what I see when 
> I try out.

Let me note that one big drawback of having those directories in the sandbox 
(or /tmp) is that shuffling cannot be parallelized over multiple disks.
That's why we prefer to set `spark.local.dir` to a comma-separated list of 
partitions (on different disks) and ensure those get cleaned regularly using 
some 'find -mtime' magic on the blockmgr-* dirs (not ideal, however, so we 
hope the migration to Spark 2 improves the situation). See the sketch below.
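
For illustration, a minimal sketch of that setup (the paths and the two-day 
retention window are placeholders):
{code}
# Sketch only: spread shuffle/spill dirs over several disks at submit time ...
spark-submit \
  --conf spark.local.dir=/data1/spark-tmp,/data2/spark-tmp \
  --class com.example.MyApp my-app.jar

# ... and periodically (e.g. from cron) remove block-manager dirs that have
# not been modified for more than two days:
find /data1/spark-tmp /data2/spark-tmp -maxdepth 1 -name 'blockmgr-*' -mtime +2 -exec rm -rf {} +
{code}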


> Temporary folders do not get deleted after Task completes causing problems 
> with disk space.
> ---
>
> Key: SPARK-12430
> URL: https://issues.apache.org/jira/browse/SPARK-12430
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.5.1, 1.5.2, 1.6.0
> Environment: Ubuntu server
>Reporter: Fede Bar
>
> We are experiencing an issue with automatic /tmp folder deletion after 
> framework completes. Completing a M/R job using Spark 1.5.2 (same behavior as 
> Spark 1.5.1) over Mesos will not delete some temporary folders causing free 
> disk space on server to exhaust. 
> Behavior of M/R job using Spark 1.4.1 over Mesos cluster:
> - Launched using spark-submit on one cluster node.
> - Following folders are created: */tmp/mesos/slaves/id#* , */tmp/spark-#/*  , 
>  */tmp/spark-#/blockmgr-#*
> - When task is completed */tmp/spark-#/* gets deleted along with 
> */tmp/spark-#/blockmgr-#* sub-folder.
> Behavior of M/R job using Spark 1.5.2 over Mesos cluster (same identical job):
> - Launched using spark-submit on one cluster node.
> - Following folders are created: */tmp/mesos/mesos/slaves/id** * , 
> */tmp/spark-***/ *  ,{color:red} /tmp/blockmgr-***{color}
> - When task is completed */tmp/spark-***/ * gets deleted but NOT shuffle 
> container folder {color:red} /tmp/blockmgr-***{color}
> Unfortunately, {color:red} /tmp/blockmgr-***{color} can account for several 
> GB depending on the job that ran. Over time this causes disk space to become 
> full with consequences that we all know. 
> Running a shell script would probably work but it is difficult to identify 
> folders in use by a running M/R or stale folders. I did notice similar issues 
> opened by other users marked as "resolved", but none seems to exactly match 
> the above behavior. 
> I really hope someone has insights on how to fix it.
> Thank you very much!






[jira] [Commented] (SPARK-16522) [MESOS] Spark application throws exception on exit

2016-08-05 Thread Laurent Hoss (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15409502#comment-15409502
 ] 

Laurent Hoss commented on SPARK-16522:
--

Is this going to be fixed soon (in 2.0.1)?!
I just got a stack trace from a developer here whose job *failed* with the 
following exception:
{code}
2016-08-05 05:43:35,186 WARN  org.apache.spark.rpc.netty.NettyRpcEndpointRef: 
Error sending message [message = RemoveExecutor(5,Executor finished with state 
FINISHED)] in 1 attempts
org.apache.spark.SparkException: Exception thrown in awaitResult
at 
org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
at 
org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
at 
scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at 
org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at 
org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
at 
org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
at 
org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:78)
at 
org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.removeExecutor(CoarseGrainedSchedulerBackend.scala:412)
at 
org.apache.spark.scheduler.cluster.mesos.MesosCoarseGrainedSchedulerBackend.executorTerminated(MesosCoarseGrainedSchedulerBackend.scala:555)
at 
org.apache.spark.scheduler.cluster.mesos.MesosCoarseGrainedSchedulerBackend.statusUpdate(MesosCoarseGrainedSchedulerBackend.scala:495)
Caused by: org.apache.spark.SparkException: Could not find 
CoarseGrainedScheduler.
at 
org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:152)
at 
org.apache.spark.rpc.netty.Dispatcher.postLocalMessage(Dispatcher.scala:127)
at org.apache.spark.rpc.netty.NettyRpcEnv.ask(NettyRpcEnv.scala:225)
at 
org.apache.spark.rpc.netty.NettyRpcEndpointRef.ask(NettyRpcEnv.scala:508)
at 
org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
... 4 more
{code}


> [MESOS] Spark application throws exception on exit
> --
>
> Key: SPARK-16522
> URL: https://issues.apache.org/jira/browse/SPARK-16522
> Project: Spark
>  Issue Type: Bug
>  Components: Mesos
>Affects Versions: 2.0.0
>Reporter: Sun Rui
>
> Spark applications running on Mesos throw exception upon exit as follows:
> {noformat}
> 16/07/13 15:20:46 WARN NettyRpcEndpointRef: Error sending message [message = 
> RemoveExecutor(1,Executor finished with state FINISHED)] in 3 attempts
> org.apache.spark.SparkException: Exception thrown in awaitResult
>   at 
> org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
>   at 
> org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
>   at 
> scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
>   at 
> org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
>   at 
> org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
>   at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
>   at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
>   at 
> org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
>   at 
> org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:78)
>   at 
> org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.removeExecutor(CoarseGrainedSchedulerBackend.scala:412)
>   at 
> org.apache.spark.scheduler.cluster.mesos.MesosCoarseGrainedSchedulerBackend.executorTerminated(MesosCoarseGrainedSchedulerBackend.scala:555)
>   at 
> org.apache.spark.scheduler.cluster.mesos.MesosCoarseGrainedSchedulerBackend.statusUpdate(MesosCoarseGrainedSchedulerBackend.scala:495)
> Caused by: org.apache.spark.SparkException: Could not find 
> CoarseGrainedScheduler.
>   at 
> org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:152)
>   at 
> org.apache.spark.rpc.netty.Dispatcher.postLocalMessage(Dispatcher.scala:127)
>   at org.apache.spark.rpc.netty.NettyRpcEnv.ask(NettyRpcEnv.scala:225)
>   at 
> org.apache.spark.rpc.netty.NettyRpcEndpointRef.ask(NettyRpcEnv.scala:508)
>   at 
> org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
>   ... 4 more
> Exception in thread "Thread-47" org.apache.spark.SparkException: Error 
> notifying standalone scheduler's driver endpoint
>