Github user vanzin commented on a diff in the pull request:
    --- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameRangeSuite.scala ---
    @@ -152,22 +154,28 @@ class DataFrameRangeSuite extends QueryTest with 
SharedSQLContext with Eventuall
       test("Cancelling stage in a query with Range.") {
    +    val slices = 10
         val listener = new SparkListener {
    -      override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
    -        eventually(timeout(10.seconds), interval(1.millis)) {
    -          assert(DataFrameRangeSuite.stageToKill > 0)
    +      override def onTaskStart(taskStart: SparkListenerTaskStart): Unit = {
    +        eventually(timeout(10.seconds)) {
    +          assert(DataFrameRangeSuite.isTaskStarted)
    -        sparkContext.cancelStage(DataFrameRangeSuite.stageToKill)
    +        sparkContext.cancelStage(taskStart.stageId)
    +        DataFrameRangeSuite.semaphore.release(slices)
    --- End diff --
    I know this is what the other test does, but after looking at this again I 
think this is still dangerous.
    `sparkContext.cancelStage` is asynchronous, so you might wake up all the 
tasks and they might finish before the cancellation actually happens. If you 
just remove this line, then you guarantee the tasks won't finish unless the 
state is canceled.
    Which means you don't really need the semaphore, you can just sleep 
indefinitely in the tasks (e.g. call `wait()`). And instead of the `eventually` 
above you could use a `CountDownLatch`.
    And if you think of it, you don't need the `DataFrameSuite` object since 
everything here is local to this test.
    All this also applies to the other test, so if you feel like cleaning up 
that one in a separate PR...


To unsubscribe, e-mail:
For additional commands, e-mail:

Reply via email to