GitHub user vanzin opened a pull request:
https://github.com/apache/spark/pull/2020
[SPARK-2933] [yarn] Refactor and cleanup Yarn AM code.
This change refactors the Yarn module so that all the logic related
to running the ApplicationMaster lives in one place. Where there were
previously 4 different classes with mostly identical code, we now have:
- A single, shared ApplicationMaster class, which can operate in both
client and cluster mode, and replaces the old ApplicationMaster
(for cluster mode) and ExecutorLauncher (for client mode).
The benefit here is that every execution mode on every supported
yarn version uses the same shared code for monitoring executor allocation,
setting up configuration, and monitoring the process's lifecycle.
- A new YarnRMClient interface, which defines basic RM functionality needed
by the ApplicationMaster. This interface has concrete implementations for
each supported Yarn version.
- A new YarnAllocator interface, which just abstracts the existing interface
of the YarnAllocationHandler class. This is to avoid having to touch the
allocator code too much in this change, although it might benefit from a
similar effort in the future.
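As a rough illustration of the structure described in the list above, here is a
minimal Scala sketch. Only the names ApplicationMaster, YarnRMClient and
YarnAllocator come from this description; every method name, signature, the
Fake* stubs and the UI address are hypothetical and are not taken from the
actual patch.

```scala
// Hypothetical sketch: method names and signatures are illustrative
// assumptions, not the API introduced by this pull request.

/** Basic RM operations the AM needs; one implementation per Yarn version. */
trait YarnRMClient {
  def register(uiAddress: String): YarnAllocator
  def shutdown(succeeded: Boolean): Unit
}

/** Thin abstraction over the existing allocation handler. */
trait YarnAllocator {
  def allocateResources(): Unit
  def getNumExecutorsRunning: Int
}

/** One AM shared by client and cluster mode. */
class ApplicationMaster(client: YarnRMClient, isClusterMode: Boolean) {
  def run(targetExecutors: Int): Boolean = {
    val mode = if (isClusterMode) "cluster" else "client"
    println(s"Starting AM in $mode mode")
    val allocator = client.register("http://localhost:4040") // placeholder address
    // Shared allocation loop, bounded so the sketch always terminates.
    var rounds = 0
    while (allocator.getNumExecutorsRunning < targetExecutors && rounds < 100) {
      allocator.allocateResources()
      rounds += 1
    }
    val ok = allocator.getNumExecutorsRunning >= targetExecutors
    client.shutdown(ok)
    ok
  }
}

// In-memory stand-ins for a version-specific implementation (hypothetical).
class FakeAllocator extends YarnAllocator {
  private var running = 0
  def allocateResources(): Unit = running += 1 // pretend one executor starts
  def getNumExecutorsRunning: Int = running
}

class FakeRMClient extends YarnRMClient {
  var finalStatus: Option[Boolean] = None
  def register(uiAddress: String): YarnAllocator = new FakeAllocator
  def shutdown(succeeded: Boolean): Unit = finalStatus = Some(succeeded)
}

object AmSketch {
  def main(args: Array[String]): Unit = {
    val rm = new FakeRMClient
    val ok = new ApplicationMaster(rm, isClusterMode = false).run(targetExecutors = 2)
    println(s"finished: $ok, reported to RM: ${rm.finalStatus}")
  }
}
```

The point of the sketch is that the AM class depends only on the two
interfaces, so a version-specific module would only need to supply its own
YarnRMClient and YarnAllocator implementations.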
The end result is code that is much easier to understand, with much less
duplication, which makes it much easier to fix bugs, add features, and
test everything knowing that all supported versions will behave the same.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/vanzin/spark SPARK-2933
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/2020.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2020
----
commit 3630f1e8d02ad4d77ce56390b52bad8bbbcd4691
Author: Marcelo Vanzin <[email protected]>
Date: 2014-08-08T20:51:25Z
[SPARK-2933] [yarn] Refactor and cleanup Yarn AM code.
commit c00be0d5f31fed394ccb41e62fc8aeb614a57e0b
Author: Marcelo Vanzin <[email protected]>
Date: 2014-08-14T22:28:22Z
Changes to the yarn-alpha project to use common AM code.
Made some tweaks to the YarnAllocator interface to cover both
APIs more easily. There's still a lot of cleanup possible on
that front, but I'll leave that as a separate task.
commit cec283a831b856f505f1c20a7e4e7c869808b641
Author: Marcelo Vanzin <[email protected]>
Date: 2014-08-14T23:15:46Z
Trivial cleanups.
commit a694d0823f383a68d58813f6e33bcaebb87882fe
Author: Marcelo Vanzin <[email protected]>
Date: 2014-08-15T00:43:33Z
Fix UI filter registration.
commit e73d00e0415a31d3785964b958c101e49e466406
Author: Marcelo Vanzin <[email protected]>
Date: 2014-08-15T01:45:44Z
Keep "ExecutorLauncher" as the main class for client-mode AM.
commit fd699fd9449e0e7c6a4ec06b4260bb11c093efe2
Author: Marcelo Vanzin <[email protected]>
Date: 2014-08-15T02:09:37Z
Finish app if SparkContext initialization times out.
This avoids the NPEs that would happen if code just kept going.
commit 6145a986564e12afa028c066dde4042057d74447
Author: Marcelo Vanzin <[email protected]>
Date: 2014-08-15T02:20:19Z
Fix some questionable error handling.
commit f3eb8dc4ed8ba33f2b24b0211067952af3c2cfbb
Author: Marcelo Vanzin <[email protected]>
Date: 2014-08-15T02:30:17Z
More trivial cleanup.
commit ccee155a81c5f6627c45f11b5c36f0b90b00dfac
Author: Marcelo Vanzin <[email protected]>
Date: 2014-08-15T02:39:01Z
Move cluster/client code to separate methods.
Makes code a little cleaner and easier to follow.
commit 6a7b07a11ca5aab21bfb4e245e514e9b5f7925b9
Author: Marcelo Vanzin <[email protected]>
Date: 2014-08-18T18:02:04Z
Some more cleanup.
commit 30968c0f61ec3c49e293f1529973c4e1db5b93b6
Author: Marcelo Vanzin <[email protected]>
Date: 2014-08-18T18:17:30Z
Restore shutdown hook to clean up staging dir.
commit 5100474aa46627e345951977d48e178ea793850f
Author: Marcelo Vanzin <[email protected]>
Date: 2014-08-18T18:26:09Z
Cleanup a couple more constants.
----