cloud-fan commented on a change in pull request #31119:
URL: https://github.com/apache/spark/pull/31119#discussion_r557936604



##########
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala
##########
@@ -74,7 +74,10 @@ case class BroadcastExchangeExec(
     child: SparkPlan) extends BroadcastExchangeLike {
   import BroadcastExchangeExec._
 
-  override val runId: UUID = UUID.randomUUID
+  // Cancelling a SQL statement from Spark ThriftServer needs to cancel
+  // its related broadcast sub-jobs. So set the run id to job group id if exists.
+  override val runId: UUID = Option(sparkContext.getLocalProperty(SparkContext.SPARK_JOB_GROUP_ID))

Review comment:
       After a second thought, I think this is risky. It's possible that in a non-STS environment, users set the job group id manually and run some long-running jobs. If we capture the job group id here in the broadcast exchange, then when the broadcast times out, it will cancel the whole job group, which may kill the user's other long-running jobs unexpectedly.
   
   I think we need to revisit the STS's SQL statement cancelling feature. We should use the SQL execution ID to find all the jobs of a SQL query, and assign a unique job group id to them.
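
The risk described above hinges on the run id no longer being unique per broadcast exchange: once it is derived from a shared job group id, cancelling one exchange's "run" cancels everything in that group. A minimal sketch of the two behaviours, with hypothetical names (`runIdFor` and its parameter are illustrative, not code from this PR):

```scala
import java.util.UUID

// Hypothetical helper: derive a run id from the job group property when it is
// set, otherwise fall back to a fresh random UUID.
def runIdFor(jobGroupId: Option[String]): UUID =
  jobGroupId
    // Deterministic: every exchange in the same group shares one id, so a
    // cancellation keyed on this id affects the whole group.
    .map(s => UUID.nameUUIDFromBytes(s.getBytes("UTF-8")))
    // Random: each exchange gets its own id, so cancellation stays local.
    .getOrElse(UUID.randomUUID())
```

With a manually set job group (`SparkContext.setJobGroup`), every broadcast exchange would map to the same id, which is exactly why a timeout-triggered cancel can take unrelated long-running jobs down with it.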



