Github user hhbyyh commented on a diff in the pull request:
https://github.com/apache/spark/pull/16158#discussion_r159016507
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala ---
@@ -133,7 +134,10 @@ class CrossValidator @Since("1.2.0") (@Since("1.4.0")
override val uid: String)
logInfo(s"Best cross-validation metric: $bestMetric.")
val bestModel = est.fit(dataset, epm(bestIndex)).asInstanceOf[Model[_]]
instr.logSuccess(bestModel)
- copyValues(new CrossValidatorModel(uid, bestModel, metrics).setParent(this))
+ val model = new CrossValidatorModel(uid, bestModel, metrics).setParent(this)
+ val summary = new TuningSummary(epm, metrics, bestIndex)
+ model.setSummary(Some(summary))
--- End diff --
The latest implementation does not need to save the extra DataFrame, since
the DataFrame can be regenerated from $(estimatorParamMaps) and
avgMetrics.
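
To illustrate why nothing extra has to be persisted, here is a minimal sketch of rebuilding the summary table from the two values the model already stores. It is not the PR's TuningSummary code: `TuningSummarySketch`, `Params`, and `summaryTable` are hypothetical names, and plain Scala collections stand in for Spark's ParamMap and DataFrame so the example runs without a Spark dependency.

```scala
// Hypothetical sketch: plain Scala stand-ins, not the PR's TuningSummary code.
object TuningSummarySketch {
  // Stand-in for one ParamMap: parameter name -> value.
  type Params = Map[String, Any]

  /** Builds (header, rows): one row per param map, with its metric appended. */
  def summaryTable(
      paramMaps: Seq[Params],
      avgMetrics: Seq[Double]): (Seq[String], Seq[Seq[String]]) = {
    require(paramMaps.length == avgMetrics.length,
      "expected exactly one metric per param map")
    // Union of all parameter names, sorted for a stable column order.
    val names = paramMaps.flatMap(_.keys).distinct.sorted
    val rows = paramMaps.zip(avgMetrics).map { case (pm, metric) =>
      // A parameter missing from a given map becomes an empty cell.
      names.map(n => pm.get(n).map(_.toString).getOrElse("")) :+ metric.toString
    }
    (names :+ "metric", rows)
  }
}

object TuningSummaryDemo extends App {
  // Assumed example grid; the values are made up for illustration.
  val grid: Seq[TuningSummarySketch.Params] = Seq(
    Map[String, Any]("regParam" -> 0.1, "maxIter" -> 10),
    Map[String, Any]("regParam" -> 0.01, "maxIter" -> 10))
  val (header, rows) = TuningSummarySketch.summaryTable(grid, Seq(0.82, 0.85))
  println(header.mkString("\t"))
  rows.foreach(r => println(r.mkString("\t")))
}
```

Because the table is a pure function of the param maps and the metrics, it can be recomputed on demand after the model is loaded, which is the reason the separate DataFrame does not need to be written out.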
---