[
https://issues.apache.org/jira/browse/SPARK-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiangrui Meng updated SPARK-4240:
---------------------------------
Target Version/s: (was: 1.3.0)
> Refine Tree Predictions in Gradient Boosting to Improve Prediction Accuracy.
> ----------------------------------------------------------------------------
>
> Key: SPARK-4240
> URL: https://issues.apache.org/jira/browse/SPARK-4240
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Affects Versions: 1.3.0
> Reporter: Sung Chung
>
> Gradient boosting as currently implemented estimates the loss gradient in
> each iteration using regression trees. At every iteration, the regression
> trees are split to minimize the variance of the predicted gradient.
> Additionally, the terminal node predictions are computed to minimize the
> prediction variance.
> However, such predictions are optimal only for the mean-squared error loss,
> not for other loss functions. The TreeBoost refinement can help mitigate
> this issue by replacing the terminal node prediction values with values that
> directly minimize the actual loss function. Although this doesn't change the
> fact that the tree splits were chosen through variance reduction, it should
> still improve the gradient estimates, and thus prediction accuracy.
> The details can be found in the R vignette. The following paper also shows
> how to refine the terminal node predictions:
> http://www.saedsayad.com/docs/gbm2.pdf
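
The terminal-node refinement described above can be illustrated with a small sketch for the absolute-error loss. This is not the MLlib implementation: the helper names (`leaf_values`, `absolute_loss`), the toy data, the fixed leaf assignment, and the step size of 1 are all assumptions made for illustration. The tree is taken as already fitted to the negative gradient (the sign of the residual). The unrefined leaf value is the mean of that gradient in the leaf, and the refined value is the median of the raw residuals in the leaf, which is the constant that minimizes absolute error there.

```python
from collections import defaultdict
from statistics import mean, median


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def leaf_values(leaf_ids, values, reduce_fn):
    """Group values by leaf id and reduce each group to one prediction."""
    groups = defaultdict(list)
    for leaf, value in zip(leaf_ids, values):
        groups[leaf].append(value)
    return {leaf: reduce_fn(vs) for leaf, vs in groups.items()}


def absolute_loss(y, pred):
    """Sum of absolute errors."""
    return sum(abs(a - b) for a, b in zip(y, pred))


# Toy data (hypothetical): labels, current ensemble prediction, and the
# leaf each point falls into in the tree fitted at this iteration.
y = [1.0, 2.0, 10.0, -1.0, 0.0, 1.0]
current = [0.0] * len(y)
leaf_ids = [0, 0, 0, 1, 1, 1]

residuals = [yi - fi for yi, fi in zip(y, current)]
# Negative gradient of the absolute-error loss is the sign of the residual.
pseudo_residuals = [sign(r) for r in residuals]

# Unrefined: each leaf predicts the mean pseudo-residual (variance-minimizing).
plain = leaf_values(leaf_ids, pseudo_residuals, mean)
# Refined: each leaf predicts the median residual (absolute-loss-minimizing).
refined = leaf_values(leaf_ids, residuals, median)

plain_pred = [f + plain[leaf] for f, leaf in zip(current, leaf_ids)]
refined_pred = [f + refined[leaf] for f, leaf in zip(current, leaf_ids)]

plain_loss = absolute_loss(y, plain_pred)      # 12.0 on this toy data
refined_loss = absolute_loss(y, refined_pred)  # 11.0 on this toy data
```

The splits are identical in both cases; only the leaf values differ, which is the point of the proposal. For other losses the per-leaf minimizer is different and may need a numerical step rather than a closed form.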