Joseph K. Bradley created SPARK-5972:
----------------------------------------
Summary: Cache residuals for GradientBoostedTrees during training
Key: SPARK-5972
URL: https://issues.apache.org/jira/browse/SPARK-5972
Project: Spark
Issue Type: Improvement
Components: MLlib
Affects Versions: 1.3.0
Reporter: Joseph K. Bradley
Priority: Minor
In gradient boosting, the current model's prediction is recomputed for each
training instance on every iteration. The cumulative prediction of the
previously trained trees in the ensemble (from which the current residual is
computed) should be cached instead. That could reduce both computation (only
the most recently trained tree's prediction needs to be computed) and
communication (only the most recently trained tree needs to be sent to the
workers).
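
A minimal sketch of the idea, in plain Python rather than MLlib code. It
assumes squared-error loss, 1-D features, and a toy one-split stump as the weak
learner. The names fit_stump, boost_cached, and predict_full are hypothetical
and are not part of the Spark API. The cached list plays the role of the
per-instance cumulative prediction, so each iteration evaluates only the
newest tree.

```python
from typing import Callable, List, Sequence, Tuple

Tree = Callable[[float], float]  # hypothetical stand-in for a trained tree


def fit_stump(xs: Sequence[float], targets: Sequence[float]) -> Tree:
    """Toy weak learner: best single-split regression stump on 1-D data."""
    best_err, best = float("inf"), None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, targets) if x <= t]
        right = [y for x, y in zip(xs, targets) if x > t]
        lmean = sum(left) / len(left)  # left is never empty: t is one of xs
        rmean = sum(right) / len(right) if right else lmean
        err = sum((y - (lmean if x <= t else rmean)) ** 2
                  for x, y in zip(xs, targets))
        if err < best_err:
            best_err, best = err, (t, lmean, rmean)
    t, lmean, rmean = best
    return lambda x: lmean if x <= t else rmean


def boost_cached(xs: Sequence[float], ys: Sequence[float], num_iters: int,
                 learning_rate: float = 0.5) -> Tuple[List[Tree], List[float]]:
    """Boosting loop that caches the cumulative prediction per instance."""
    trees: List[Tree] = []
    cached_pred = [0.0] * len(xs)  # cumulative prediction of all trees so far
    for _ in range(num_iters):
        # Residuals come from the cache; no earlier tree is re-evaluated.
        residuals = [y - p for y, p in zip(ys, cached_pred)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        # Only the most recently trained tree is applied to update the cache.
        cached_pred = [p + learning_rate * tree(x)
                       for p, x in zip(cached_pred, xs)]
    return trees, cached_pred


def predict_full(trees: Sequence[Tree], x: float,
                 learning_rate: float = 0.5) -> float:
    """Uncached baseline: evaluates every tree in the ensemble."""
    return sum(learning_rate * tree(x) for tree in trees)
```

In this sketch the cached values agree with predict_full over the whole
ensemble, while each iteration costs one tree evaluation per instance instead
of one per tree trained so far. In a distributed setting the cache would live
alongside the training data on the workers, which is what would allow only the
newest tree to be shipped.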