[ https://issues.apache.org/jira/browse/SPARK-21086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16112623#comment-16112623 ]
Nick Pentreath commented on SPARK-21086:
----------------------------------------

I just want to understand _why_ folks want to keep all the models? Is it actually the models (and model data) they want, or a way (well, easier "official API" way) to link the param permutations with the cross-val score to see what param combinations result in what scores? (In which case, https://issues.apache.org/jira/browse/SPARK-18704 is actually the solution).

> CrossValidator, TrainValidationSplit should preserve all models after fitting
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-21086
>                 URL: https://issues.apache.org/jira/browse/SPARK-21086
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>    Affects Versions: 2.2.0
>            Reporter: Joseph K. Bradley
>
> I've heard multiple requests for having CrossValidatorModel and
> TrainValidationSplitModel preserve the full list of fitted models. This
> sounds very valuable.
> One decision should be made before we do this: Should we save and load the
> models in ML persistence? That could blow up the size of a saved Pipeline if
> the models are large.
> * I suggest *not* saving the models by default but allowing saving if
> specified. We could specify whether to save the model as an extra Param for
> CrossValidatorModelWriter, but we would have to make sure to expose
> CrossValidatorModelWriter as a public API and modify the return type of
> CrossValidatorModel.write to be CrossValidatorModelWriter (but this will not
> be a breaking change).

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)