[ https://issues.apache.org/jira/browse/SPARK-6004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337762#comment-14337762 ]
Joseph K. Bradley commented on SPARK-6004:
------------------------------------------
I'm not too worried about stopping early when users call runWithValidation().
If they know enough to use a validation set, then I think it's reasonable to
expect them to set validationTol according to what they need. This was
discussed a little in [SPARK-5972], where the decision was to provide a helper
method for users to do validation post-hoc: [SPARK-6025].

I feel like the current default behavior is good since it favors efficiency,
while still leaving the option to do more expensive training with potentially
higher accuracy.
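
To make the trade-off concrete, here is a minimal sketch in plain Python (not
Spark code). The stopping rule, the function names, and the error values are
all made up for illustration; they are assumptions, not the MLlib
implementation. It compares stopping as soon as the validation error fails to
improve by a tolerance against training every iteration and picking the best
one afterwards.

```python
def early_stop_iteration(val_errors, tol):
    """Number of iterations kept when training stops early.

    Hypothetical rule (an assumption, not Spark's exact criterion): stop as
    soon as the improvement over the best error seen so far is below `tol`.
    """
    best_err = val_errors[0]
    best_m = 1
    for m, err in enumerate(val_errors[1:], start=2):
        if best_err - err < tol:
            break  # improvement too small (or error went up): stop here
        best_err = err
        best_m = m
    return best_m


def best_iteration(val_errors):
    """Number of iterations with the lowest validation error overall."""
    best_index = min(range(len(val_errors)), key=val_errors.__getitem__)
    return best_index + 1


# Illustrative, made-up validation errors per boosting iteration.
# The error is not monotonic: it rises at iteration 3, then drops again.
errors = [0.40, 0.30, 0.31, 0.25, 0.27]

early = early_stop_iteration(errors, tol=1e-5)
best = best_iteration(errors)
print("early stopping keeps", early, "iterations")
print("post-hoc selection keeps", best, "iterations")
```

With these made-up numbers, early stopping keeps 2 iterations and never sees
the lower error at iteration 4, which post-hoc selection finds at the cost of
training all 5 iterations.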
> Pick the best model when training GradientBoostedTrees with validation
> ----------------------------------------------------------------------
>
> Key: SPARK-6004
> URL: https://issues.apache.org/jira/browse/SPARK-6004
> Project: Spark
> Issue Type: Improvement
> Components: MLlib
> Reporter: Liang-Chi Hsieh
> Priority: Minor
>
> Since the validation error does not change monotonically, in practice it
> would be better to pick the best model when training GradientBoostedTrees
> with validation instead of stopping early.