[ https://issues.apache.org/jira/browse/SPARK-21086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16112623#comment-16112623 ]
Nick Pentreath commented on SPARK-21086:
----------------------------------------
I just want to understand _why_ folks want to keep all the models. Is it
actually the models (and model data) they want, or an easier "official API"
way to link the param permutations with the cross-val scores, so they can see
which param combinations produce which scores? (In that case,
https://issues.apache.org/jira/browse/SPARK-18704 is actually the solution.)
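
To make the second option concrete, here is a minimal sketch in plain Python
of what "linking param permutations with scores" amounts to. It assumes the
param maps and the averaged metrics are index-aligned, one metric per param
map (as I understand cv.getEstimatorParamMaps() and cvModel.avgMetrics to be).
The helper rank_param_maps and all values below are hypothetical and for
illustration only; they are not Spark API or real results.

```python
# Hypothetical stand-ins for cv.getEstimatorParamMaps() and cvModel.avgMetrics.
# Assumption: both sequences are index-aligned, one average metric per param map.
param_maps = [
    {"regParam": 0.01, "maxIter": 10},
    {"regParam": 0.1, "maxIter": 10},
    {"regParam": 0.01, "maxIter": 50},
]
avg_metrics = [0.81, 0.78, 0.84]  # illustrative values only, not real results


def rank_param_maps(param_maps, avg_metrics, larger_is_better=True):
    """Pair each param map with its score and sort the pairs best-first."""
    if len(param_maps) != len(avg_metrics):
        raise ValueError("expected one metric per param map")
    pairs = list(zip(param_maps, avg_metrics))
    return sorted(pairs, key=lambda pair: pair[1], reverse=larger_is_better)


ranked = rank_param_maps(param_maps, avg_metrics)
for params, score in ranked:
    print(f"{score:.2f}  {params}")
```

Note that this needs only the scores, not the fitted models themselves, which
is why the distinction between the two use cases matters for this issue.
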
> CrossValidator, TrainValidationSplit should preserve all models after fitting
> -----------------------------------------------------------------------------
>
> Key: SPARK-21086
> URL: https://issues.apache.org/jira/browse/SPARK-21086
> Project: Spark
> Issue Type: New Feature
> Components: ML
> Affects Versions: 2.2.0
> Reporter: Joseph K. Bradley
>
> I've heard multiple requests for having CrossValidatorModel and
> TrainValidationSplitModel preserve the full list of fitted models. This
> sounds very valuable.
> One decision should be made before we do this: Should we save and load the
> models in ML persistence? That could blow up the size of a saved Pipeline if
> the models are large.
> * I suggest *not* saving the models by default but allowing saving if
> specified. We could specify whether to save the model as an extra Param for
> CrossValidatorModelWriter, but we would have to make sure to expose
> CrossValidatorModelWriter as a public API and modify the return type of
> CrossValidatorModel.write to be CrossValidatorModelWriter (but this will not
> be a breaking change).