GitHub user jerryshao opened a pull request:
https://github.com/apache/spark/pull/9963
[SPARK-10582][Yarn][Core] Fix AM failure situation for dynamic allocation
After an AM failure, the target executor number held by the driver and the
one held by the restarted AM will be different, which leads to unexpected
behavior in dynamic allocation. So when the AM re-registers with the driver,
the state in `ExecutorAllocationManager` and `CoarseGrainedSchedulerBackend`
should be reset.
This issue was originally addressed in #8737 and is re-opened here. Thanks
a lot to @KaiXinXiaoLei for finding this issue.
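To illustrate the mismatch described above, here is a minimal toy model in Python. It is not Spark code: the class names `ToyAllocationManager` and `ToyApplicationMaster`, the function `on_am_reregistered`, and the choice to reset the target back to its initial value are all assumptions made for illustration only.

```python
class ToyAllocationManager:
    """Hypothetical driver-side record of the target executor number."""

    def __init__(self, initial_target):
        self.initial_target = initial_target
        self.target = initial_target

    def request_more(self, n):
        # The driver raises its target as pending work grows.
        self.target += n

    def reset(self):
        # Assumption: resetting means going back to the initial target.
        self.target = self.initial_target


class ToyApplicationMaster:
    """Hypothetical AM-side record of the target executor number."""

    def __init__(self, initial_target):
        self.target = initial_target

    def sync(self, target):
        self.target = target


def on_am_reregistered(manager, new_am):
    # Reset the driver-side state, then hand the fresh target to the new AM
    # so both sides agree again.
    manager.reset()
    new_am.sync(manager.target)


driver = ToyAllocationManager(initial_target=2)
am = ToyApplicationMaster(initial_target=2)
driver.request_more(3)
am.sync(driver.target)            # both sides now hold the same target

am = ToyApplicationMaster(initial_target=2)   # AM fails and is restarted
mismatch = driver.target != am.target         # driver still holds the old target

on_am_reregistered(driver, am)
in_sync = driver.target == am.target
```

In this sketch the restarted AM starts from its initial target while the driver still remembers the raised one, which is the kind of divergence the description refers to; resetting on re-registration brings the two back in line.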
@andrewor14 and @vanzin, would you please help review this? Thanks a lot.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jerryshao/apache-spark SPARK-10582
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/9963.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #9963
----
commit c272c7eb005bf443678f2cd89c6971a3f022edbd
Author: jerryshao <[email protected]>
Date: 2015-11-24T09:08:00Z
Fix AM failure situation for dynamic allocation
commit 1f92d27f525500a907d1862b47eea156ff2aff85
Author: jerryshao <[email protected]>
Date: 2015-11-25T06:15:05Z
Remove unnecessary code
----