WeichenXu123 commented on a change in pull request #32399:
URL: https://github.com/apache/spark/pull/32399#discussion_r629087709
##########
File path:
mllib/src/main/scala/org/apache/spark/ml/tuning/TrainValidationSplit.scala
##########
@@ -161,11 +169,26 @@ class TrainValidationSplit @Since("1.5.0") (@Since("1.5.0") override val uid: St
     }
     // Wait for all metrics to be calculated
-    val metrics = metricFutures.map(ThreadUtils.awaitResult(_, Duration.Inf))
-
-    // Unpersist training & validation set once all metrics have been produced
-    trainingDataset.unpersist()
-    validationDataset.unpersist()
+    val metrics = try {
+      metricFutures.map(ThreadUtils.awaitResult(_, Duration.Inf))
+    }
+    catch {
+      case e: Throwable =>
+        subTaskFailed = true
+        throw e
+    }
+    finally {
+      if (subTaskFailed) {
+        Thread.sleep(1000)
Review comment:
The reason for this sleep:
a trial task whose thread is already running may take some time before it
launches its Spark job. If we cancel jobs immediately here, we may miss
killing a Spark job that is about to be spawned.
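
To illustrate the race, here is a minimal sketch in plain Scala with no Spark dependency. `ToyScheduler`, `leakedJobs`, `prepMs`, and `graceMs` are hypothetical names invented for this example; the toy `cancelAll()` only assumes that cancellation affects jobs that have already been submitted, which is the behaviour the comment above relies on. It is not the code in this PR.

```scala
import scala.collection.mutable
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical stand-in for a job scheduler. cancelAll() only removes jobs
// that are already registered; a job submitted afterwards is untouched.
final class ToyScheduler {
  private val active = mutable.Set.empty[String]

  def submit(jobId: String): Unit = synchronized { active += jobId }

  // Cancels the currently registered jobs and returns how many there were.
  def cancelAll(): Int = synchronized {
    val cancelled = active.size
    active.clear()
    cancelled
  }

  def activeCount: Int = synchronized { active.size }
}

object SleepBeforeCancelSketch {
  // Runs one "trial" that needs prepMs of work before it submits its job,
  // cancels after graceMs, and returns how many jobs survived the cancel.
  def leakedJobs(graceMs: Long, prepMs: Long = 200L): Int = {
    implicit val ec: ExecutionContext = ExecutionContext.global
    val scheduler = new ToyScheduler

    // The trial thread is already running but has not submitted its job yet.
    val trial = Future {
      Thread.sleep(prepMs)
      scheduler.submit("trial-1")
    }

    Thread.sleep(graceMs)           // grace period before cancelling
    scheduler.cancelAll()           // only sees jobs submitted so far
    Await.result(trial, 10.seconds) // let the trial finish submitting
    scheduler.activeCount
  }

  def main(args: Array[String]): Unit = {
    println(s"cancel immediately: leaked = ${leakedJobs(graceMs = 0L)}")
    println(s"cancel after sleep: leaked = ${leakedJobs(graceMs = 1000L)}")
  }
}
```

With these timings, cancelling immediately is expected to miss the job that the trial submits 200 ms later, while waiting 1000 ms first should catch it. The sleep is only a heuristic: a trial that needs longer than the grace period to submit its job would still slip through.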
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.