Hi again,
I'm revisiting this problem after familiarizing myself with Cython and
scikit-learn's code, and I have an important question:
Looking at the class MSE(RegressionCriterion), the node impurity is defined
as the variance of the target values Y on that node. The predictions X are
Hi, Thomas,
in regression trees, minimizing the variance among the target values is
equivalent to minimizing the MSE between targets and predicted values. This is
also called variance reduction:
https://en.wikipedia.org/wiki/Decision_tree_learning#Variance_reduction
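A minimal sketch of that equivalence (my own illustration, not scikit-learn's actual Cython code): a regression-tree leaf predicts the mean of its targets, and the MSE of that constant prediction is exactly the variance of the targets, so reducing child-node variance is the same as reducing child-node MSE.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(size=100)         # toy targets for one node

pred = y.mean()                  # a regression-tree leaf predicts the node mean
mse = np.mean((y - pred) ** 2)   # MSE of that constant prediction
var = np.var(y)                  # (biased) variance of the same targets

assert np.isclose(mse, var)      # identical by definition
```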
Best,
Sebastian
Hi Sebastian,
Going back to Pearson's R loss function, does this imply that I must add an
abstract "init2" method to RegressionCriterion (the class that MSE
inherits from), taking the predicted values X as an extra argument? And
then the node impurity will be 1-R (the lower the better)? What
Hi, Thomas,
as far as I know, it's all the same: you would get the same splits, since
R^2 is just an affinely rescaled MSE (R^2 = 1 - MSE/Var(y), and Var(y) is
fixed for a given node).
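To make that concrete, here is a small numeric sketch (plain NumPy, my own toy data and threshold scan, not scikit-learn's criterion code): ranking candidate split thresholds by weighted child MSE, or by the R^2 of the resulting one-split piecewise-constant fit, selects the same split.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 1.0, 50))       # one feature, sorted
y = 2.0 * x + rng.normal(scale=0.1, size=50)  # noisy linear targets

def weighted_child_mse(threshold):
    """Weighted MSE of the two children; each child predicts its own mean."""
    left, right = y[x <= threshold], y[x > threshold]
    return (left.size * np.var(left) + right.size * np.var(right)) / y.size

thresholds = x[1:-1]                          # candidate split points
mses = np.array([weighted_child_mse(t) for t in thresholds])
r2s = 1.0 - mses / np.var(y)                  # R^2 of the one-split fit

# Minimizing MSE and maximizing R^2 pick the same threshold.
assert np.isclose(mses[np.argmin(mses)], mses[np.argmax(r2s)])
```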
Best,
Sebastian
> On Mar 1, 2018, at 9:39 AM, Thomas Evangelidis wrote:
Does this generalize to any loss function? For example, I also want to
implement Kendall's tau correlation coefficient and a combination of R, tau,
and RMSE. :)
On Mar 1, 2018 15:49, "Sebastian Raschka" wrote:
Unfortunately (or maybe fortunately :)) no, maximizing variance reduction and
minimizing MSE are just special cases in which the equivalence holds :)
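For what it's worth, here is a hypothetical Python-level sketch of what a tau-based node "impurity" might look like, using scipy.stats.kendalltau; the function name and interface are my own invention, and a real scikit-learn Criterion would have to be written in Cython and would need access to the predictions, not just the targets. Unlike variance/MSE, tau does not decompose into per-sample terms, which is why it is not a special case of the existing MSE criterion.

```python
import numpy as np
from scipy.stats import kendalltau

def tau_impurity(y_true, y_pred):
    """Hypothetical impurity: 1 - Kendall's tau between predictions and targets.

    0 when the rankings are perfectly concordant, 2 when perfectly reversed.
    """
    if len(y_true) < 2:
        return 0.0                 # a single sample is trivially "pure"
    tau, _ = kendalltau(y_pred, y_true)
    if np.isnan(tau):              # constant input -> tau is undefined
        return 1.0
    return 1.0 - tau

y_true = np.array([1.0, 2.0, 3.0, 4.0])
print(tau_impurity(y_true, y_true))    # perfectly concordant rankings
print(tau_impurity(y_true, -y_true))   # perfectly discordant rankings
```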
Best,
Sebastian
> On Mar 1, 2018, at 9:59 AM, Thomas Evangelidis wrote: