[jira] [Commented] (SPARK-6004) Pick the best model when training GradientBoostedTrees with validation

Joseph K. Bradley (JIRA) Wed, 25 Feb 2015 18:59:07 -0800

    [ https://issues.apache.org/jira/browse/SPARK-6004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337762#comment-14337762 ]


Joseph K. Bradley commented on SPARK-6004:
------------------------------------------

I'm not too worried about stopping early when users call runWithValidation().  
If they know enough to use a validation set, then I think it's reasonable to 
expect them to set validationTol according to what they need.  This was 
discussed a little in [SPARK-5972], where the decision was to provide a helper 
method for users to do validation post-hoc: [SPARK-6025]
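
To make "validation post-hoc" concrete, here is a rough sketch in plain Python 
(not Spark code). It assumes the ensemble prediction is a weighted sum of 
per-tree predictions and uses mean squared error. The function name 
errors_per_iteration and the toy numbers are hypothetical, and this is not the 
helper proposed in [SPARK-6025]; it only shows the idea of scoring the model 
after each boosting iteration once training has finished.

```python
from typing import List, Sequence


def errors_per_iteration(
    tree_preds: Sequence[Sequence[float]],
    weights: Sequence[float],
    labels: Sequence[float],
) -> List[float]:
    """Mean squared error of the ensemble truncated after each tree.

    tree_preds[i][j] is the (hypothetical) prediction of tree i on
    validation example j; weights[i] is that tree's weight.
    """
    n = len(labels)
    running = [0.0] * n  # ensemble prediction so far, per example
    errors = []
    for preds, w in zip(tree_preds, weights):
        # Add this tree's weighted contribution to the running prediction.
        running = [r + w * p for r, p in zip(running, preds)]
        mse = sum((r - y) ** 2 for r, y in zip(running, labels)) / n
        errors.append(mse)
    return errors


# Toy, made-up data: three trees scored on two validation examples.
labels = [1.0, 2.0]
tree_preds = [[1.0, 1.0], [0.0, 1.0], [1.0, -1.0]]
weights = [1.0, 1.0, 1.0]

print(errors_per_iteration(tree_preds, weights, labels))  # [0.5, 0.0, 1.0]
```

With a list like this in hand, a user can decide after the fact how many 
iterations to keep.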

I think the current default behavior is good since it favors efficiency, 
while still leaving the option to do more expensive training with potentially 
higher accuracy.
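
The trade-off can be illustrated with a small sketch in plain Python. This is a 
simplified model of tolerance-based early stopping versus picking the best 
iteration, not Spark's actual implementation; the function names and the 
validation errors are made up, and tol only plays the role that validationTol 
plays in the discussion above.

```python
from typing import List, Tuple


def stop_early(errors: List[float], tol: float) -> Tuple[int, float]:
    """Return (iterations_kept, error) under tolerance-based early stopping.

    Stops at the first iteration whose improvement over the best error
    seen so far is smaller than tol.
    """
    best_m, best_err = 1, errors[0]
    for m in range(1, len(errors)):
        if best_err - errors[m] < tol:
            break  # improvement too small: stop training here
        elif errors[m] < best_err:
            best_m, best_err = m + 1, errors[m]
    return best_m, best_err


def pick_best(errors: List[float]) -> Tuple[int, float]:
    """Return (iterations_kept, error) after running every iteration."""
    best_idx = min(range(len(errors)), key=errors.__getitem__)
    return best_idx + 1, errors[best_idx]


# Hypothetical, non-monotonic validation errors per boosting iteration.
validation_errors = [0.50, 0.40, 0.41, 0.35, 0.30, 0.32]

print(stop_early(validation_errors, tol=1e-5))  # (2, 0.4)
print(pick_best(validation_errors))             # (5, 0.3)
```

Early stopping halts at the first small bump and keeps 2 iterations, while 
running all 6 and picking the best keeps 5 with a lower error, at the cost of 
the extra training. In this sketch, a sufficiently negative tol never triggers 
the stop, so stop_early then returns the same answer as pick_best.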

> Pick the best model when training GradientBoostedTrees with validation
> ----------------------------------------------------------------------
>
>                 Key: SPARK-6004
>                 URL: https://issues.apache.org/jira/browse/SPARK-6004
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Liang-Chi Hsieh
>            Priority: Minor
>
> Since the validation error does not change monotonically, in practice it 
> would be better to pick the best model when training GradientBoostedTrees 
> with validation instead of stopping early.


