[ 
https://issues.apache.org/jira/browse/SPARK-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391521#comment-14391521
 ] 

Manoj Kumar commented on SPARK-5972:
------------------------------------

[~josephkb] This should be done independently of evaluateEachIteration, right? 
(That is, the GradientBoostedTrees code that caches the error and residuals 
should not use evaluateEachIteration, since the model has not been trained 
yet.)



> Cache residuals for GradientBoostedTrees during training
> --------------------------------------------------------
>
>                 Key: SPARK-5972
>                 URL: https://issues.apache.org/jira/browse/SPARK-5972
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 1.3.0
>            Reporter: Joseph K. Bradley
>            Priority: Minor
>
> In gradient boosting, the current model's prediction is re-computed for each 
> training instance on every iteration.  The current residual (cumulative 
> prediction of previously trained trees in the ensemble) should be cached.  
> That could reduce both computation (only computing the prediction of the most 
> recently trained tree) and communication (only sending the most recently 
> trained tree to the workers).
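
The caching idea described above can be sketched roughly as follows. This is
a minimal single-machine illustration in plain Python, not the Spark MLlib
implementation: it assumes squared-error loss, and the names fit_stump,
boost_with_cache and predict are hypothetical helpers introduced only for
this example. The point it shows is that each iteration evaluates only the
newest tree and adds its output to a cached cumulative prediction, instead of
re-evaluating the whole ensemble.

```python
from typing import Callable, List, Sequence, Tuple

Model = Callable[[float], float]


def fit_stump(xs: Sequence[float], residuals: Sequence[float]) -> Model:
    """Fit a one-split regression stump to the residuals.

    Hypothetical weak learner used only for illustration.
    """
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        # The largest threshold leaves the right side empty; reuse left_mean.
        right_mean = sum(right) / len(right) if right else left_mean
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, split, left_value, right_value = best
    return lambda x: left_value if x <= split else right_value


def boost_with_cache(
    xs: Sequence[float],
    ys: Sequence[float],
    num_iterations: int,
    learning_rate: float = 0.5,
) -> Tuple[List[Model], List[float]]:
    """Gradient boosting (squared error) with a cached cumulative prediction."""
    cached_pred = [0.0] * len(xs)  # cumulative prediction of all trees so far
    trees: List[Model] = []
    for _ in range(num_iterations):
        # Residuals come from the cache, not from re-running every tree.
        residuals = [y - p for y, p in zip(ys, cached_pred)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        # Only the most recently trained tree is evaluated here.
        cached_pred = [p + learning_rate * tree(x) for x, p in zip(xs, cached_pred)]
    return trees, cached_pred


def predict(trees: Sequence[Model], x: float, learning_rate: float = 0.5) -> float:
    """Uncached prediction that walks the whole ensemble, for comparison."""
    return sum(learning_rate * tree(x) for tree in trees)
```

In a distributed setting, the cached predictions would presumably live
alongside the training instances, so that only the newest tree needs to be
shipped to the workers on each iteration; how that cache is persisted is left
open here.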



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

