Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/19350#discussion_r157118663
--- Diff: mllib/src/main/scala/org/apache/spark/ml/Estimator.scala ---
@@ -82,5 +86,49 @@ abstract class Estimator[M <: Model[M]] extends PipelineStage {
paramMaps.map(fit(dataset, _))
}
+ /**
+ * (Java-specific)
+ */
+ @Since("2.3.0")
+ def fit(dataset: Dataset[_], paramMaps: Array[ParamMap],
+ unpersistDatasetAfterFitting: Boolean, executionContext: ExecutionContext,
+ modelCallback: VoidFunction2[Model[_], Int]): Unit = {
+ // Fit models in a Future for training in parallel
+ val modelFutures = paramMaps.map { paramMap =>
+ Future[Model[_]] {
+ fit(dataset, paramMap).asInstanceOf[Model[_]]
--- End diff --
@MLnick I discussed this offline with @jkbradley and @MrBago, and here is the
latest proposal:
https://docs.google.com/document/d/1xw5M4sp1e0eQie75yIt-r6-GTuD5vpFf_I6v-AFBM3M/edit?usp=sharing
Thanks!
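
For readers following the quoted diff, here is a minimal sketch of the pattern it uses: fit one model per parameter set inside a Future, then hand each fitted model and its index to a callback. It is not the code from the PR. `Params`, `FittedModel`, `fitOne` and `fitAll` are hypothetical stand-ins for `ParamMap`, `Model[_]`, `Estimator.fit` and the new overload. The `unpersistDatasetAfterFitting` flag is left out, and calling the callback inside the Future is an assumption, since the quoted diff is cut off before that point.

```scala
import java.util.concurrent.{ConcurrentHashMap, Executors}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

object ParallelFitSketch {
  // Hypothetical stand-ins for Spark's ParamMap and Model; not the real types.
  final case class Params(regParam: Double)
  final case class FittedModel(params: Params)

  // Hypothetical stand-in for Estimator.fit(dataset, paramMap).
  def fitOne(data: Seq[Double], params: Params): FittedModel = FittedModel(params)

  // Fit one model per parameter set in parallel and pass each (model, index)
  // pair to the callback. Assumption: the callback runs inside the Future.
  def fitAll(
      data: Seq[Double],
      paramMaps: Array[Params],
      callback: (FittedModel, Int) => Unit)(implicit ec: ExecutionContext): Unit = {
    val futures = paramMaps.zipWithIndex.map { case (params, index) =>
      Future {
        val model = fitOne(data, params)
        callback(model, index)
      }
    }
    // Block until every fit and its callback has finished; a failed fit
    // rethrows its exception here.
    futures.foreach(Await.result(_, Duration.Inf))
  }

  def main(args: Array[String]): Unit = {
    val pool = Executors.newFixedThreadPool(2)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    // Thread-safe map, because callbacks may run concurrently.
    val results = new ConcurrentHashMap[Int, FittedModel]()
    try {
      fitAll(Seq(1.0, 2.0), Array(Params(0.1), Params(0.01)),
        (model, index) => results.put(index, model))
    } finally {
      pool.shutdown()
    }
    println(s"fitted ${results.size} models")
  }
}
```

Because the callback can be invoked from several threads at once, whatever it writes to should be thread-safe, as with the `ConcurrentHashMap` above.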