Github user gaborgsomogyi commented on a diff in the pull request:
https://github.com/apache/spark/pull/21214#discussion_r185792166
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/DataFrameRangeSuite.scala ---
@@ -153,23 +153,17 @@ class DataFrameRangeSuite extends QueryTest with
SharedSQLContext with Eventuall
test("Cancelling stage in a query with Range.") {
val listener = new SparkListener {
- override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
- eventually(timeout(10.seconds), interval(1.millis)) {
- assert(DataFrameRangeSuite.stageToKill > 0)
- }
- sparkContext.cancelStage(DataFrameRangeSuite.stageToKill)
+ override def onTaskStart(taskStart: SparkListenerTaskStart): Unit = {
+ sparkContext.cancelStage(taskStart.stageId)
}
}
sparkContext.addSparkListener(listener)
for (codegen <- Seq(true, false)) {
withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key ->
codegen.toString()) {
- DataFrameRangeSuite.stageToKill = -1
val ex = intercept[SparkException] {
- spark.range(0, 100000000000L, 1, 1).map { x =>
- DataFrameRangeSuite.stageToKill = TaskContext.get().stageId()
- x
- }.toDF("id").agg(sum("id")).collect()
+ spark.range(0, 100000000000L, 1, 1)
--- End diff --
Yeah, after extensive testing and tryout I was thinking about this as well
but hesitating to stop the executor thread any kind of way. The reasons behind:
- Executor code can be synchronized only with object variables, otherwise
`NotSerializableException` comes. As a result of introducing object
`CountDownLatch ` similar race will appear just like
`DataFrameRangeSuite.stageToKill` was overwritten.
- `spark.range(0, 100000000000L, 1, 1)` takes ~1-2 minutes to get
calculated which is quite a big time window not to be flaky this way.
- Minor concern(rather note) if no timeout syncronization way used the test
code get stuck in `collect` and never reaches `waitUntilEmpty` and
`eventually...`.
All in all not recommend to do it because it could end up in 5 hours
timeout.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]