Joseph K. Bradley created SPARK-5972:
----------------------------------------

             Summary: Cache residuals for GradientBoostedTrees during training
                 Key: SPARK-5972
                 URL: https://issues.apache.org/jira/browse/SPARK-5972
             Project: Spark
          Issue Type: Improvement
          Components: MLlib
    Affects Versions: 1.3.0
            Reporter: Joseph K. Bradley
            Priority: Minor


In gradient boosting, the current model's prediction is re-computed for every 
training instance on each iteration.  The cumulative prediction of the 
previously trained trees in the ensemble (from which residuals are derived) 
should be cached instead.  That would reduce both computation (only the most 
recently trained tree's predictions need to be computed per iteration) and 
communication (only the most recently trained tree needs to be sent to the 
workers).
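A minimal, framework-free sketch of the proposed caching scheme (plain Python 
rather than Spark; the function name and the stand-in "tree", which just 
predicts the mean residual, are hypothetical illustrations, not MLlib API):

```python
def boost_with_cached_predictions(labels, num_iterations, learning_rate=0.5):
    """Boosting loop that keeps a running cache of cumulative predictions.

    Residuals are read off the cache each iteration, so no pass over the
    full ensemble is ever needed; only the newest tree's contribution is
    added to the cache.
    """
    n = len(labels)
    cached_pred = [0.0] * n  # cumulative prediction per instance, kept up to date
    trees = []
    for _ in range(num_iterations):
        # Residuals come straight from the cache: no full-ensemble re-scan.
        residuals = [y - p for y, p in zip(labels, cached_pred)]
        # Stand-in "tree": predicts the mean residual for every instance.
        tree_pred = sum(residuals) / n
        trees.append(tree_pred)
        # Update the cache with only the newest tree's contribution.
        for i in range(n):
            cached_pred[i] += learning_rate * tree_pred
    return trees, cached_pred
```

In the distributed setting the same idea would mean persisting an RDD of 
per-instance cumulative predictions and updating it with each new tree, 
instead of broadcasting the whole ensemble and re-evaluating it every 
iteration.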



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
