Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/19208#discussion_r148371489
--- Diff: mllib/src/main/scala/org/apache/spark/ml/param/shared/SharedParamsCodeGen.scala ---
@@ -82,7 +82,11 @@ private[shared] object SharedParamsCodeGen {
        "all instance weights as 1.0"),
      ParamDesc[String]("solver", "the solver algorithm for optimization",
        finalFields = false),
      ParamDesc[Int]("aggregationDepth", "suggested depth for treeAggregate (>= 2)", Some("2"),
-       isValid = "ParamValidators.gtEq(2)", isExpertParam = true))
+       isValid = "ParamValidators.gtEq(2)", isExpertParam = true),
+     ParamDesc[Boolean]("collectSubModels", "whether to collect a list of sub-models trained " +
--- End diff ---
Some more explanation would be nice:
If set to false, then only the single best sub-model will be available
after fitting.
If set to true, then all sub-models will be available. Warning: For large
models, collecting all sub-models can cause OOMs on the Spark driver.
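For context, a minimal sketch of how a user would opt in to this behavior through the tuning API. This assumes the `CrossValidator.setCollectSubModels` setter and the `CrossValidatorModel.subModels` accessor from this line of work; the dataset and object name are made up for illustration:

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}
import org.apache.spark.sql.SparkSession

object CollectSubModelsSketch {
  // Fits a small cross-validated LogisticRegression with collectSubModels
  // enabled and returns (numFolds, paramGridSize) as observed via subModels.
  def subModelShape(): (Int, Int) = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("collectSubModels-sketch")
      .getOrCreate()
    import spark.implicits._
    try {
      // Tiny, balanced toy dataset; purely illustrative.
      val df = Seq(
        (0.0, Vectors.dense(0.0, 1.0)),
        (1.0, Vectors.dense(1.0, 0.0)),
        (0.0, Vectors.dense(0.1, 1.2)),
        (1.0, Vectors.dense(1.2, 0.1)),
        (0.0, Vectors.dense(-0.1, 0.9)),
        (1.0, Vectors.dense(0.9, -0.1)),
        (0.0, Vectors.dense(0.2, 1.1)),
        (1.0, Vectors.dense(1.1, 0.2))
      ).toDF("label", "features")

      val lr = new LogisticRegression()
      val grid = new ParamGridBuilder()
        .addGrid(lr.regParam, Array(0.01, 0.1))
        .build()

      val cv = new CrossValidator()
        .setEstimator(lr)
        .setEvaluator(new BinaryClassificationEvaluator())
        .setEstimatorParamMaps(grid)
        .setNumFolds(2)
        .setCollectSubModels(true) // default false: keep only the best model

      val model = cv.fit(df)
      // subModels is indexed [fold][paramMapIndex]; collecting it can OOM
      // the driver for large models, hence the warning in the param doc.
      (model.subModels.length, model.subModels.head.length)
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit =
    println(subModelShape())
}
```

With `collectSubModels` left at its default of false, `subModels` is not populated and only the single best model survives fitting, which is the distinction the expanded doc string above spells out.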
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]