GitHub user gaborgsomogyi opened a pull request:
https://github.com/apache/spark/pull/21214
[SPARK-23775][TEST] Make DataFrameRangeSuite not flaky
## What changes were proposed in this pull request?
DataFrameRangeSuite.test("Cancelling stage in a query with Range.") sometimes
gets stuck in an infinite loop and times out the build.
There were multiple issues with the test:
1. The first valid stageId is zero when the test is run on its own rather than
as part of a suite, so the following check can never pass and waits until the
timeout:
```
eventually(timeout(10.seconds), interval(1.millis)) {
  assert(DataFrameRangeSuite.stageToKill > 0)
}
```
2. `DataFrameRangeSuite.stageToKill` was overwritten by the task's thread
after the reset, which resulted in the same stage being cancelled twice. This
caused the infinite wait.
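Issue 1 can be reproduced without Spark. The following is a minimal sketch,
not code from this PR: `StageIdSentinelSketch`, `NoStage` and `pollUntil` are
hypothetical names, and the sentinel check is shown only to illustrate why
`> 0` never passes when the first stage ID is zero. This PR removes the shared
variable instead of introducing a sentinel.

```scala
import java.util.concurrent.atomic.AtomicInteger

object StageIdSentinelSketch {
  // Hypothetical sentinel: -1 means "no stage seen yet", because 0 is a valid stage ID.
  val NoStage: Int = -1
  val stageToKill = new AtomicInteger(NoStage)

  // Re-evaluates `condition` every millisecond until it holds or `timeoutMs` elapses.
  // Returns true if the condition held before the deadline.
  def pollUntil(timeoutMs: Long)(condition: => Boolean): Boolean = {
    val deadline = System.nanoTime() + timeoutMs * 1000000L
    var ok = condition
    while (!ok && System.nanoTime() < deadline) {
      Thread.sleep(1)
      ok = condition
    }
    ok
  }

  def main(args: Array[String]): Unit = {
    // Simulates the first stage of a freshly started SparkContext.
    stageToKill.set(0)

    // Same shape as the original check: never true for stage 0, so it times out.
    val strict = pollUntil(50)(stageToKill.get > 0)
    // Sentinel-aware check: passes as soon as any stage ID has been recorded.
    val sentinelAware = pollUntil(50)(stageToKill.get != NoStage)

    println(s"strict=$strict sentinelAware=$sentinelAware")
  }
}
```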
This PR fixes the flakiness by removing the shared
`DataFrameRangeSuite.stageToKill` and using `onTaskStart`, where the stage ID
is provided. To make sure `cancelStage` is called for all stages,
`waitUntilEmpty` is called on `ListenerBus`.
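The listener-based approach can be sketched roughly as follows. This is a
standalone illustration that assumes Spark is on the classpath, not the test
code from this PR: the object name `CancelStageSketch`, the `local[2]` master
and the `Thread.sleep` inside the tasks are assumptions made for the example.
The `waitUntilEmpty` call is left out because, as far as I know, the listener
bus of `SparkContext` is only reachable from Spark's own packages.

```scala
import org.apache.spark.SparkException
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskStart}
import org.apache.spark.sql.SparkSession

object CancelStageSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session stands in for the suite's shared SparkContext.
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("cancel-stage-sketch")
      .getOrCreate()
    val sc = spark.sparkContext

    // The stage ID comes from the event itself, so no shared mutable
    // variable is needed to pass it from the task to the test thread.
    val listener = new SparkListener {
      override def onTaskStart(taskStart: SparkListenerTaskStart): Unit = {
        sc.cancelStage(taskStart.stageId)
      }
    }
    sc.addSparkListener(listener)

    try {
      // The tasks sleep so that they are still running when the cancel arrives.
      sc.range(0, 10000L, numSlices = 10).mapPartitions { it =>
        Thread.sleep(2000)
        it
      }.count()
      println("Job finished before it was cancelled")
    } catch {
      case e: SparkException =>
        println(s"Job was cancelled: ${e.getMessage}")
    } finally {
      sc.removeSparkListener(listener)
      spark.stop()
    }
  }
}
```

Because every started task triggers `cancelStage` with its own stage ID, the
sketch does not depend on whether the first stage ID is zero.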
[PR20888](https://github.com/apache/spark/pull/20888) tried to solve this by:
* Stopping the executor thread with `wait`
* Waiting until `cancelStage` was called for all stages
* Killing the executor thread by setting
`SparkContext.SPARK_JOB_INTERRUPT_ON_CANCEL`

but killing the thread sometimes left the shared `SparkContext` in a state
where further tasks could not be submitted. As a result, the
DataFrameRangeSuite.test("Cancelling stage in a query with Range.") test passed
but the next test in the suite hung.
## How was this patch tested?
Existing unit test executed 10k times.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/gaborgsomogyi/spark SPARK-23775_1
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21214.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21214
----
commit 9781cbee95f338d5e1bcd61190c7a938155803bf
Author: Gabor Somogyi <gabor.g.somogyi@...>
Date: 2018-05-02T09:23:38Z
[SPARK-23775][TEST] Make DataFrameRangeSuite not flaky
----