Ngone51 commented on a change in pull request #28245:
URL: https://github.com/apache/spark/pull/28245#discussion_r411045901
##########
File path: core/src/main/scala/org/apache/spark/BarrierTaskContext.scala
##########
@@ -85,28 +84,28 @@ class BarrierTaskContext private[spark] (
// BarrierCoordinator on timeout, instead of RPCTimeoutException from the RPC framework.
timeout = new RpcTimeout(365.days, "barrierTimeout"))
- // messages which consist of all barrier tasks' messages
- var messages: Array[String] = null
// Wait the RPC future to be completed, but every 1 second it will jump out waiting
// and check whether current spark task is killed. If killed, then throw
// a `TaskKilledException`, otherwise continue wait RPC until it completes.
- try {
- while (!abortableRpcFuture.toFuture.isCompleted) {
+
+ // import scala Success locally to avoid conflict with org.apache.spark.Success
+ import scala.util.{Failure, Success, Try}
+ while (!abortableRpcFuture.future.isCompleted) {
+ try {
// wait RPC future for at most 1 second
- try {
- messages = ThreadUtils.awaitResult(abortableRpcFuture.toFuture, 1.second)
- } catch {
- case _: TimeoutException | _: InterruptedException =>
- // If `TimeoutException` thrown, waiting RPC future reach 1 second.
- // If `InterruptedException` thrown, it is possible this task is killed.
- // So in this two cases, we should check whether task is killed and then
- // throw `TaskKilledException`
- taskContext.killTaskIfInterrupted()
+ Thread.sleep(1000)
+ } catch {
+ case _: InterruptedException => // task is killed by driver
+ } finally {
+ Try(taskContext.killTaskIfInterrupted()) match {
+ case Success(_) => // task is still running healthily
+ case Failure(e) => abortableRpcFuture.abort(e)
}
}
- } finally {
- abortableRpcFuture.abort(taskContext.getKillReason().getOrElse("Unknown reason."))
Review comment:
Under the hood, `abort` just calls `promise.tryFailure(e)`, so it has no
effect if the RPC future has already completed. Previously, too, it only
took effect when the exception from `taskContext.killTaskIfInterrupted()`
came first.
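
To illustrate the `tryFailure` point, here is a minimal standalone sketch using
only `scala.concurrent.Promise`. It does not use Spark's `AbortableRpcFuture`;
the object and method names (`TryFailureDemo`, `abortAfterCompletion`,
`abortWhilePending`) are hypothetical, and the promise here merely stands in
for the one I assume sits behind the RPC future.

```scala
import scala.concurrent.Promise

// Hypothetical demo object; not part of Spark.
object TryFailureDemo {

  // The reply arrives first, then we try to "abort".
  // tryFailure returns false because the promise is already completed,
  // so the abort is silently ignored.
  def abortAfterCompletion(): Boolean = {
    val promise = Promise[Array[String]]()
    promise.success(Array("msg-0", "msg-1"))
    promise.tryFailure(new RuntimeException("task killed"))
  }

  // The "abort" happens while the promise is still pending.
  // tryFailure returns true and the future now holds the failure.
  def abortWhilePending(): Boolean = {
    val promise = Promise[Array[String]]()
    val aborted = promise.tryFailure(new RuntimeException("task killed"))
    aborted && promise.future.value.exists(_.isFailure)
  }

  def main(args: Array[String]): Unit = {
    println(s"abort after completion took effect: ${abortAfterCompletion()}")
    println(s"abort while pending took effect: ${abortWhilePending()}")
  }
}
```

If `abort` really is a thin wrapper over `tryFailure`, then calling it
unconditionally in a `finally` block is harmless once the reply is in, but it
also cannot be relied on to signal anything in that case.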