viirya commented on a change in pull request #28245: [SPARK-31472][CORE] Make sure Barrier Task always return messages or exception with abortableRpcFuture check
URL: https://github.com/apache/spark/pull/28245#discussion_r410977586
##########
File path: core/src/main/scala/org/apache/spark/BarrierTaskContext.scala
##########
@@ -85,28 +84,28 @@ class BarrierTaskContext private[spark] (
// BarrierCoordinator on timeout, instead of RPCTimeoutException from the RPC framework.
timeout = new RpcTimeout(365.days, "barrierTimeout"))
- // messages which consist of all barrier tasks' messages
- var messages: Array[String] = null
// Wait the RPC future to be completed, but every 1 second it will jump out waiting
// and check whether current spark task is killed. If killed, then throw
// a `TaskKilledException`, otherwise continue wait RPC until it completes.
- try {
- while (!abortableRpcFuture.toFuture.isCompleted) {
+
+ // import scala Success locally to avoid conflict with org.apache.spark.Success
+ import scala.util.{Failure, Success, Try}
+ while (!abortableRpcFuture.future.isCompleted) {
+ try {
// wait RPC future for at most 1 second
- try {
- messages = ThreadUtils.awaitResult(abortableRpcFuture.toFuture, 1.second)
- } catch {
- case _: TimeoutException | _: InterruptedException =>
- // If `TimeoutException` thrown, waiting RPC future reach 1 second.
- // If `InterruptedException` thrown, it is possible this task is killed.
- // So in this two cases, we should check whether task is killed and then
- // throw `TaskKilledException`
- taskContext.killTaskIfInterrupted()
+ Thread.sleep(1000)
+ } catch {
+ case _: InterruptedException => // task is killed by driver
+ } finally {
+ Try(taskContext.killTaskIfInterrupted()) match {
+ case Success(_) => // task is still running healthily
+ case Failure(e) => abortableRpcFuture.abort(e)
}
}
- } finally {
- abortableRpcFuture.abort(taskContext.getKillReason().getOrElse("Unknown reason."))
Review comment:
Just out of curiosity, why did we previously call `abortableRpcFuture.abort` even
when the RPC future had completed normally?
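
To make the question concrete, here is a minimal sketch of what I would expect an abort on an already-completed future to do. It assumes the abortable future is backed by a plain `scala.concurrent.Promise`; `AbortableSketch` is a hypothetical stand-in, not Spark's actual class, so the real behavior may differ.

```scala
import scala.concurrent.{Future, Promise}

// Hypothetical stand-in for the PR's abortableRpcFuture; NOT Spark's real class.
final class AbortableSketch[T] {
  private val promise = Promise[T]()

  def future: Future[T] = promise.future

  // Models the RPC reply arriving normally.
  def complete(value: T): Boolean = promise.trySuccess(value)

  // Returns false when the future is already completed, so the abort is ignored.
  def abort(reason: Throwable): Boolean = promise.tryFailure(reason)
}

object AbortSketchDemo {
  def main(args: Array[String]): Unit = {
    val rpc = new AbortableSketch[Array[String]]
    rpc.complete(Array("msg-from-task-0"))

    // Mirrors the removed `finally` block: abort unconditionally after the wait.
    val aborted = rpc.abort(new RuntimeException("Unknown reason."))

    println(s"abort took effect: $aborted")
    println(s"result still successful: ${rpc.future.value.exists(_.isSuccess)}")
  }
}
```

Under that assumption the unconditional abort would be a harmless no-op on the normal path, which is why I am asking whether it served another purpose.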
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.