GitHub user holdenk opened a pull request:
https://github.com/apache/spark/pull/19045
[WIP][SPARK-20628][CORE] Keep track of nodes (/ spot instances) which are
going to be shutdown
## What changes were proposed in this pull request?
Keep track of nodes which are going to be shut down so that no new tasks are
scheduled on them. The PR is designed with spot instances in mind, where there
is some notice (depending on the cloud vendor) that the node will be shut down.
Since each vendor notifies instances of pending termination in a different
manner, it is left to the instance to notify the worker(s) of decommissioning
with SIGPWR.
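As a rough illustration of the flow described above (not the code in this PR), the sketch below shows a process that marks its own host as decommissioning when it receives SIGPWR, and a tracker that filters such hosts out of the set eligible for new tasks. The names `DecommissionTracker`, `LOCAL_HOST`, `on_sigpwr` and `notify_decommission` are hypothetical and exist only for this example; SIGPWR is assumed to be available, which in practice means Linux.

```python
import os
import signal


class DecommissionTracker:
    """Hypothetical helper: remembers hosts that should get no new tasks."""

    def __init__(self):
        self._decommissioning = set()

    def mark(self, host):
        # Record that this host has announced it is about to shut down.
        self._decommissioning.add(host)

    def schedulable(self, hosts):
        # Return the hosts that may still be offered tasks, preserving order.
        return [h for h in hosts if h not in self._decommissioning]


tracker = DecommissionTracker()
LOCAL_HOST = "worker-1"  # illustrative host name, not from the PR


def on_sigpwr(signum, frame):
    # Runs when the instance-side monitor signals a pending shutdown.
    tracker.mark(LOCAL_HOST)


# SIGPWR is not defined on every platform, so guard the registration.
if hasattr(signal, "SIGPWR"):
    signal.signal(signal.SIGPWR, on_sigpwr)


def notify_decommission(pid):
    # What a vendor-specific monitoring script might do once it learns
    # that the instance is about to be terminated.
    os.kill(pid, signal.SIGPWR)
```

In this sketch the vendor-specific part (how the instance learns about termination) stays outside the process, which mirrors the choice in the description to leave that mechanism to the instance and only standardise on the signal.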
SPARK-20628 is a sub-task of SPARK-20624, with follow-up tasks to perform
migration of data and re-launching of tasks. SPARK-20628 is distinct from other
mechanisms where Spark itself has control of executor decommissioning; however,
the later follow-up tasks in SPARK-20624 should be usable across voluntary and
involuntary termination (e.g. https://github.com/apache/spark/pull/19041 could
provide a good mechanism for doing data copy during involuntary termination).
## How was this patch tested?
Extension of AppClientSuite to cover decommissioning and addition of an
explicit worker decommissioning suite.
TODO: Deploy on live EC2 cluster with companion monitoring script and wait
for spot instance prices to spike and confirm decommissioning.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/holdenk/spark SPARK-20628-keep-track-of-nodes-which-are-going-to-be-shutdown-r2
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19045.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19045
----
commit 81fff20471bd2aded08380c8dd99c09fe34d2c79
Author: Holden Karau <[email protected]>
Date: 2017-05-12T14:07:40Z
Start of work on adventures
commit e470bac53151418d02dd5f03f243d635900376a9
Author: Holden Karau <[email protected]>
Date: 2017-06-02T12:16:39Z
Mini progresss
commit a00c707cd707c6ca2003c4d53ee51735dda3a96e
Author: Holden Karau <[email protected]>
Date: 2017-06-02T13:29:41Z
Go down the path of handling as lost but urgh lets just blacklist instead
maybe
commit 74ade447ec94b600f5447a9269e66e47ae78fb11
Author: Holden Karau <[email protected]>
Date: 2017-06-09T18:59:26Z
Plumb through executor loss to the scheduables
commit a880177f9bf45a2f0644229fbff863f80d058161
Author: Holden Karau <[email protected]>
Date: 2017-06-21T13:00:03Z
AppClient suite works! yay
commit b9704038e96b0bb862b824cb9723e68633e18c06
Author: Holden Karau <[email protected]>
Date: 2017-06-21T17:04:53Z
Decomissioning now works in the coarse grained scheduler, yay....
commit ded6bbc8d056f9f82302450aa27dbc0d94fdbccd
Author: Holden Karau <[email protected]>
Date: 2017-06-22T16:10:06Z
Remove sketchy println debugging
commit 16c855ad9eb7b961d805bc2f459d86f3b3d31108
Author: Holden Karau <[email protected]>
Date: 2017-07-06T06:13:41Z
Add a worker decommissioning suite
commit c79a06d0d38c53877c9d1b607ba95bb7b89f1e44
Author: Holden Karau <[email protected]>
Date: 2017-07-06T23:41:34Z
Merge in latest master
commit e3798d0f462659ca4ebb4ba660c8f00aa023c380
Author: Holden Karau <[email protected]>
Date: 2017-08-16T18:57:00Z
Merge branch 'master' into
SPARK-20628-keep-track-of-nodes-which-are-going-to-be-shutdown-r2
commit 4f70706847a4d04b78e19e0eaa50035e9721e7f0
Author: Holden Karau <[email protected]>
Date: 2017-08-17T18:32:56Z
Merge branch 'master' into
SPARK-20628-keep-track-of-nodes-which-are-going-to-be-shutdown-r2
commit 07c3e3e01516f43f67ad67cc55581197008c7556
Author: Holden Karau <[email protected]>
Date: 2017-08-22T19:10:01Z
Merge branch 'master' into
SPARK-20628-keep-track-of-nodes-which-are-going-to-be-shutdown-r2
commit c2a0ad87dc3220eb5154a6d0a117ce0260bd2695
Author: Holden Karau <[email protected]>
Date: 2017-08-22T20:28:24Z
Add decommissioning script for whatever process is running locally on host
to call
commit 672c3b6f79400cce867ce273199ccdcf995b6ed6
Author: Holden Karau <[email protected]>
Date: 2017-08-22T21:38:51Z
Leave polling mechanism up to the cloud vendors
commit 9cfdb7fc36691bf0c627080de5c2008fe83ba3bd
Author: Holden Karau <[email protected]>
Date: 2017-08-22T21:55:12Z
Remove legacy comment and remove some unecessary blank lines
commit 65a29c12c1740c285ff7b06f3788cd2a92ce87f1
Author: Holden Karau <[email protected]>
Date: 2017-08-22T21:59:24Z
Remove manually debugging printlns (oops)
----