Yes and no. Something to be aware of is that a Job as such exists in the DAGScheduler as part of the Application running on the Driver. When people talk about stopping or killing a Job, however, they often mean not just stopping the DAGScheduler from telling the Executors to run more Tasks associated with that Job, but also stopping any associated Tasks that are already running on Executors. That is something Spark doesn't try to do by default, and changing that behavior has been an open issue for a long time -- cf. SPARK-17064
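
If both the mistaken job and the corrected one run inside the same long-lived application (which is typical with a coarse-grained Mesos deployment), the programmatic route is to tag the work with a job group and cancel that group via SparkContext.setJobGroup / cancelJobGroup. Below is a rough sketch only: the group id, the thread handling and the expensiveTransform placeholder are illustrative, and interruptOnCancel = true merely asks Spark to interrupt running task threads (with the known caveats around HDFS-1208); it does not guarantee they stop.

import org.apache.spark.{SparkConf, SparkContext}

object CancelExample {
  def main(args: Array[String]): Unit = {
    // Master/deployment settings are assumed to come from spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("cancel-example"))

    // Run the possibly long job in its own thread, tagged with a job group.
    // setJobGroup is thread-local, so call it in the thread that triggers
    // the action.
    val worker = new Thread {
      override def run(): Unit = {
        sc.setJobGroup("long-job", "first, possibly mistaken run",
          interruptOnCancel = true)
        try {
          sc.parallelize(1 to 1000000).map(expensiveTransform).count()
        } catch {
          case e: org.apache.spark.SparkException =>
            println(s"job cancelled: ${e.getMessage}")
        }
      }
    }
    worker.start()

    // Later, when the user submits the corrected job, cancel the old group
    // (this can be called from any thread) before starting the new work:
    sc.cancelJobGroup("long-job")
  }

  // Placeholder for whatever transformation is actually being run.
  def expensiveTransform(i: Int): Int = { Thread.sleep(10); i }
}
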
On Wed, Oct 5, 2016 at 2:07 PM, Michael Gummelt <mgumm...@mesosphere.io> wrote:

> If running in client mode, just kill the job. If running in cluster mode,
> the Spark Dispatcher exposes an HTTP API for killing jobs. I don't think
> this is externally documented, so you might have to check the code to find
> this endpoint. If you run in dcos, you can just run "dcos spark kill <id>".
>
> You can also find which node is running the driver, ssh in, and kill the
> process.
>
> On Wed, Oct 5, 2016 at 1:55 PM, Richard Siebeling <rsiebel...@gmail.com>
> wrote:
>
>> Hi,
>>
>> how can I stop a long-running job?
>>
>> We're running Spark in Mesos coarse-grained mode. Suppose the user starts
>> a long-running job, makes a mistake, changes a transformation and runs the
>> job again. In this case I'd like to cancel the first job and after that
>> start the second job. It would be a waste of resources to finish the first
>> job (which could possibly take several hours...)
>>
>> How can this be accomplished?
>> thanks in advance,
>> Richard
>
>
> --
> Michael Gummelt
> Software Engineer
> Mesosphere
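
PS: for the Dispatcher route Michael mentions, the kill call appears to be the same /v1/submissions/kill/<submissionId> route exposed by the REST submission server used for cluster-mode submits. Since it isn't officially documented, treat the host, port and submission id in this sketch as assumptions to verify against your own Dispatcher (the id is whatever spark-submit printed when the driver was submitted):

import java.net.{HttpURLConnection, URL}
import scala.io.Source

object KillSubmission {
  def main(args: Array[String]): Unit = {
    // Placeholders: Dispatcher host/port and the submission id reported by
    // spark-submit in cluster mode.
    val submissionId = "driver-20161005-000001"
    val endpoint =
      s"http://dispatcher-host:7077/v1/submissions/kill/$submissionId"

    // The kill route expects an empty POST; the response is a small JSON
    // acknowledgement.
    val conn = new URL(endpoint).openConnection().asInstanceOf[HttpURLConnection]
    conn.setRequestMethod("POST")

    val body = Source.fromInputStream(conn.getInputStream).mkString
    println(s"HTTP ${conn.getResponseCode}: $body")
    conn.disconnect()
  }
}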