Ngone51 commented on pull request #31298: URL: https://github.com/apache/spark/pull/31298#issuecomment-766596881
```scala
[info] org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 3.0 failed 4 times, most recent failure: Lost task 1.3 in stage 3.0 (TID 8, 192.168.1.57, executor 1): java.io.IOException: org.apache.spark.storage.BlockSavedOnDecommissionedBlockManagerException: Block broadcast_2_piece0 cannot be saved on decommissioned executor
[info] at ...
```

Usually, when a task fails the maximum number of times, all of the failures share the same root cause. But in the case of `BlockSavedOnDecommissionedBlockManagerException`, I would not expect all four failures to be caused by it, since we don't schedule tasks on a decommissioning executor. (Unless every one of the task failures happened before the decommission notice arrived at the driver, but that seems very unlikely.) So, to confirm that we really do avoid scheduling tasks on decommissioning executors (or that there is indeed a race condition, or it is just a coincidence), I'd like to know about the previous failures of this task (TID = 8): what were the failure reasons for the earlier attempts, and on which executors were they scheduled?

BTW, the change itself lgtm.
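For context on why four identical failures would be surprising, here is a minimal, Spark-free sketch of the driver-side retry accounting (the names `TaskRetrySketch` and `Attempt` are made up for illustration; the real logic lives in `TaskSetManager`): a task is retried until it succeeds or accumulates `spark.task.maxFailures` (default 4) failures, at which point the stage, and hence the job, is aborted.

```scala
// Illustrative sketch only -- not Spark code. Models how a single task's
// attempts are counted against spark.task.maxFailures (default 4).
object TaskRetrySketch {
  val maxTaskFailures = 4

  // One attempt of the task: which executor it ran on, and whether it failed.
  final case class Attempt(executorId: String, failed: Boolean)

  // Returns Right(executorId) for the attempt that succeeded, or
  // Left(abortMessage) once the task has failed maxTaskFailures times.
  def runTask(attempts: Seq[Attempt]): Either[String, String] = {
    var failures = 0
    for (a <- attempts) {
      if (a.failed) {
        failures += 1
        if (failures >= maxTaskFailures)
          return Left(
            s"Task failed $failures times, most recent on executor ${a.executorId}")
      } else {
        return Right(a.executorId)
      }
    }
    Left(s"Task failed $failures times")
  }
}
```

Under this model, a job only aborts with `BlockSavedOnDecommissionedBlockManagerException` as the final reason if the last attempt landed on a decommissioned executor; whether the three earlier attempts failed for the same reason (and on which executors) is exactly what the question above is probing.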