GitHub user patrickbrownsync opened a pull request:
https://github.com/apache/spark/pull/22883
[SPARK-25837] [Core] Fix potential slowdown in AppStatusListener when
cleaning up stages
## What changes were proposed in this pull request?
* Update the `AppStatusListener` `cleanupStages` method to remove the tasks for
all stages being cleaned up in a single pass, instead of one pass per stage.
* This fixes an issue where the `cleanupStages` method would get backed up,
which in turn backed up the executor in `ElementTrackingStore`, so stages
and jobs were not cleaned up properly.
Tasks seem most susceptible to this because there are so many of them; however,
a similar issue could arise anywhere else the `KVStore` `view` method is
used. A broader fix might involve updates to `KVStoreView` and `InMemoryView`,
as this interface and implementation appear to allow multiple,
inefficient traversals of the stored data.
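
As a rough illustration of the difference between the two approaches, here is a minimal sketch that does not use Spark's actual classes. `TaskRecord`, `CleanupSketch`, `removePerStage`, and `removeSinglePass` are hypothetical names, and a plain `Vector` stands in for the store; the `visited` counter only approximates how much stored data each approach traverses.

```scala
// Hypothetical stand-in for a stored task record; this is not Spark's own type.
final case class TaskRecord(taskId: Long, stageId: Int, attemptId: Int)

object CleanupSketch {
  // A stage is identified here by (stageId, attemptId); this is an assumption of the sketch.
  type StageKey = (Int, Int)

  // One traversal of the stored tasks for every stage being cleaned up.
  // Returns the surviving tasks and the number of records visited.
  def removePerStage(
      tasks: Vector[TaskRecord],
      stages: Seq[StageKey]): (Vector[TaskRecord], Long) = {
    var visited = 0L
    var remaining = tasks
    stages.foreach { key =>
      remaining = remaining.filter { t =>
        visited += 1
        (t.stageId, t.attemptId) != key
      }
    }
    (remaining, visited)
  }

  // A single traversal that checks each task against the set of stages to remove.
  def removeSinglePass(
      tasks: Vector[TaskRecord],
      stages: Seq[StageKey]): (Vector[TaskRecord], Long) = {
    val doomed = stages.toSet
    var visited = 0L
    val remaining = tasks.filter { t =>
      visited += 1
      !doomed.contains((t.stageId, t.attemptId))
    }
    (remaining, visited)
  }
}
```

Both functions leave the same tasks behind, but the single-pass version visits each stored record once regardless of how many stages are cleaned up, which is the property this change relies on.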
## How was this patch tested?
Using the existing tests in `AppStatusListenerSuite`.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/Blyncs/spark cleanup-stages-fix
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22883.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22883
----
commit 4bfdac025ff0906316cf7697933a7b374ae3b427
Author: Patrick Brown <patrick.brown@...>
Date: 2018-10-29T19:49:50Z
Update cleanupStages in AppStatusListener to delete tasks for all stages in
a single pass
commit 178f7c3bf82f93177fce086037ece6ebf09bb350
Author: Patrick Brown <patrick.brown@...>
Date: 2018-10-29T19:55:38Z
remove uneeded type
----