[ https://issues.apache.org/jira/browse/SPARK-21086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16081500#comment-16081500 ]
Joseph K. Bradley commented on SPARK-21086:
-------------------------------------------
I like the idea for that path, but it could become really long in some cases,
so I'd prefer to use indices instead for robustness.
Driver memory shouldn't be a big problem since all models are already collected
to the driver.
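
To illustrate the path-length concern, here is a rough Python sketch. It assumes the proposed path would encode each model's param values in the directory name, which is a guess about the earlier proposal and not something stated in this thread. The function names and the "subModels" layout are hypothetical, not Spark API.

```python
import posixpath


def param_based_path(base, param_map):
    # Hypothetical scheme: encode every param name and value in the
    # directory name. The name grows with the number of params.
    name = "_".join(f"{k}={v}" for k, v in sorted(param_map.items()))
    return posixpath.join(base, name)


def index_based_path(base, split_index, param_index):
    # Hypothetical scheme: identify a model only by its fold/split index
    # and its position in the param grid. The length stays bounded.
    return posixpath.join(base, f"split{split_index}", f"model{param_index}")
```

With indices, the mapping back to params would come from the stored param grid rather than from the path itself.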
> CrossValidator, TrainValidationSplit should preserve all models after fitting
> -----------------------------------------------------------------------------
>
> Key: SPARK-21086
> URL: https://issues.apache.org/jira/browse/SPARK-21086
> Project: Spark
> Issue Type: New Feature
> Components: ML
> Affects Versions: 2.2.0
> Reporter: Joseph K. Bradley
>
> I've heard multiple requests for having CrossValidatorModel and
> TrainValidationSplitModel preserve the full list of fitted models. This
> sounds very valuable.
> One decision should be made before we do this: Should we save and load the
> models in ML persistence? That could blow up the size of a saved Pipeline if
> the models are large.
> * I suggest *not* saving the models by default but allowing saving if
> specified. We could specify whether to save the model as an extra Param for
> CrossValidatorModelWriter, but we would have to make sure to expose
> CrossValidatorModelWriter as a public API and modify the return type of
> CrossValidatorModel.write to be CrossValidatorModelWriter (but this will not
> be a breaking change).
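
As a rough illustration of the opt-in saving suggested above, here is a minimal Python sketch of a writer that skips the sub-models unless asked to persist them. The class name, the persist_sub_models method, and the on-disk layout are all hypothetical and are not Spark's API; the models are stand-in dicts written as JSON.

```python
import json
import os


class SketchModelWriter:
    """Toy writer: always saves the best model, sub-models only on request."""

    def __init__(self, best_model, sub_models):
        self._best_model = best_model
        self._sub_models = sub_models
        self._persist_sub_models = False  # default: do not save sub-models

    def persist_sub_models(self, flag):
        # Returns self so the call can be chained before save().
        self._persist_sub_models = flag
        return self

    def save(self, path):
        os.makedirs(path, exist_ok=True)
        with open(os.path.join(path, "bestModel.json"), "w") as f:
            json.dump(self._best_model, f)
        if self._persist_sub_models:
            sub_dir = os.path.join(path, "subModels")
            os.makedirs(sub_dir, exist_ok=True)
            # Sub-models are named by index, not by their param values.
            for i, model in enumerate(self._sub_models):
                with open(os.path.join(sub_dir, f"{i}.json"), "w") as f:
                    json.dump(model, f)
        return sorted(os.listdir(path))
```

Keeping the flag on the writer, not on the model, means the default saved size is unchanged unless the caller opts in.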