WeichenXu123 commented on a change in pull request #25235: [SPARK-28483][Core] Fix canceling a Spark job using barrier mode when barrier tasks are blocking on BarrierTaskContext.barrier()
URL: https://github.com/apache/spark/pull/25235#discussion_r309965137
##########
File path: core/src/main/scala/org/apache/spark/rpc/RpcEndpointRef.scala
##########
@@ -46,6 +46,17 @@ private[spark] abstract class RpcEndpointRef(conf: SparkConf)
*/
def send(message: Any): Unit
+  /**
+   * Send a message to the corresponding [[RpcEndpoint.receiveAndReply]] and return a [[Future]]
+   * to receive the reply within the specified timeout.
+   * Returns a `CancelableFuture` instance, which wraps the `Future` but adds a `cancel` method.
+   *
+   * This method only sends the message once and never retries.
+   */
+  def askCancelable[T: ClassTag](message: Any, timeout: RpcTimeout): CancelableFuture[T] = {
Review comment:
@zsxwing
Although we cannot withdraw a message once it has been sent out, I would like cancellation to perform the same kind of cleanup that an RPC timeout does. See the code here:
https://github.com/apache/spark/blob/2eda1876337e65915f03464076a772ea809bd361/core/src/main/scala/org/apache/spark/rpc/netty/NettyRpcEnv.scala#L248
`rpcMessage.onCancel()` will do the same thing as `rpcMessage.onTimeout()`.
If we do not add a `cancel` interface, there is no way to trigger `rpcMessage.onCancel()`, because we cannot access the `rpcMessage` instance from the `Future` instance.
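
To make the idea concrete, here is a minimal, hypothetical sketch of how a `CancelableFuture` could pair the reply `Future` with a cleanup callback. Only the `CancelableFuture` name and the `cancel` method come from this PR; the constructor shape, the `onCancelCallback` parameter, and the `CancelableFutureSketch` object are illustrative assumptions, not the PR's actual implementation:

```scala
import scala.concurrent.{Future, Promise}

// Hypothetical sketch (not the PR's actual code): pair the reply Future with
// a cancel() hook so callers can run the same cleanup an RPC timeout would,
// e.g. invoking rpcMessage.onCancel() and failing the promise.
class CancelableFuture[T](val future: Future[T], onCancelCallback: () => Unit) {
  /** Run the cleanup registered for this in-flight ask. */
  def cancel(): Unit = onCancelCallback()
}

object CancelableFutureSketch {
  def main(args: Array[String]): Unit = {
    val promise = Promise[String]()

    // In NettyRpcEnv this callback would also call rpcMessage.onCancel(),
    // mirroring what rpcMessage.onTimeout() does for a TimeoutException.
    val reply = new CancelableFuture[String](
      promise.future,
      () => promise.tryFailure(new RuntimeException("ask was cancelled")))

    reply.cancel()
    println(reply.future.value) // Some(Failure(java.lang.RuntimeException: ask was cancelled))
  }
}
```

The point of the wrapper is simply that the caller holds an object that knows about the in-flight `rpcMessage` (via the callback), which a bare `Future` cannot provide.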