We are running pyspark against our cluster in coarse-grained mode by
specifying the --master mesos://host:5050 flag, which correctly creates one
task on each node.
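For context, the launch looks roughly like the following sketch (hostname
and port are placeholders; spark.mesos.coarse is the Spark 1.x setting that
selects coarse-grained mode):

```shell
# Hypothetical invocation; adjust the master host/port for your cluster.
# In Spark 1.x, coarse-grained Mesos mode is enabled via spark.mesos.coarse.
pyspark --master mesos://host:5050 --conf spark.mesos.coarse=true
```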

However, if the driver is shut down, these executors become
orphaned_tasks: they keep consuming resources on the slave, but are no
longer represented in the master's view of available resources.
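To illustrate what we mean by the master losing track of them: comparing the
master's state.json against a slave's shows frameworks the master no longer
knows about. A sketch (field names assumed from the Mesos state.json
endpoints; find_orphaned_executors is a hypothetical helper, not a Mesos
API):

```python
def find_orphaned_executors(master_state, slave_state):
    """Return executor ids running on a slave whose framework the
    master no longer tracks (i.e. orphaned executors)."""
    # Framework ids the master still considers live.
    live = {fw["id"] for fw in master_state.get("frameworks", [])}
    orphans = []
    # Walk the slave's view: any executor under a framework id the
    # master no longer lists is an orphan.
    for fw in slave_state.get("frameworks", []):
        if fw["id"] not in live:
            for ex in fw.get("executors", []):
                orphans.append(ex["id"])
    return orphans
```

In practice the two dicts would come from fetching /master/state.json and
/slave(1)/state.json over HTTP and decoding the JSON.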

Examining the stdout/stderr shows it exited:

Registered executor on node4
Starting task 0
sh -c 'cd spark-1*;  ./bin/spark-class
org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://
[email protected]:41563 --executor-id
aa1337b6-43b0-4236-b445-c8ccbfb60506-S2/0 --hostname node4 --cores 31
--app-id aa1337b6-43b0-4236-b445-c8ccbfb60506-0097'
Forked command at 117620
Command exited with status 1 (pid: 117620)

But these executors remain on all the slaves.

What can we do to clear them out? Stopping mesos-slave and removing the
entire work-dir works, but it also destroys our other tasks.
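Rather than deleting the full work-dir, we would prefer to remove only the
sandboxes belonging to the dead framework. A sketch, assuming the standard
slave work_dir layout (work_dir/slaves/&lt;slave-id&gt;/frameworks/&lt;framework-id&gt;/...);
sandbox_dirs_for_framework is a hypothetical helper, not part of Mesos:

```python
import os

def sandbox_dirs_for_framework(work_dir, framework_id):
    """Return the framework-specific sandbox directories under a Mesos
    slave work_dir, so only those can be removed instead of the whole
    directory. Layout assumed:
      <work_dir>/slaves/<slave-id>/frameworks/<framework-id>/..."""
    matches = []
    slaves_root = os.path.join(work_dir, "slaves")
    if not os.path.isdir(slaves_root):
        return matches
    for slave_id in os.listdir(slaves_root):
        fw_dir = os.path.join(slaves_root, slave_id,
                              "frameworks", framework_id)
        if os.path.isdir(fw_dir):
            matches.append(fw_dir)
    return matches
```

One could then shutil.rmtree() just those paths on each slave, leaving the
sandboxes of healthy frameworks untouched.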

Thanks,
June Taylor
System Administrator, Minnesota Population Center
University of Minnesota
