MSE is the mean of the squared residuals. For the training data, the OOB estimate is used (i.e., residual = data - OOB prediction, MSE = sum(residual^2) / n, where the OOB prediction for a case is the mean of the predictions from all trees for which that case is OOB). It is _not_ the average of the per-tree OOB MSEs in the forest.
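To make the distinction concrete, here is a minimal Python sketch of the bookkeeping. It is not the randomForest code: bagged 1-nearest-neighbour predictors stand in for trees, the function name oob_summary and the toy data are made up for illustration, and Var(y) is taken with an n divisor, which is an assumption. It computes the MSE from the aggregated OOB predictions, the pseudo R^2 as 1 - mse / Var(y), and, for comparison only, the average of the per-learner OOB MSEs.

```python
import math
import random


def oob_summary(x, y, n_trees=200, seed=0):
    """OOB MSE and pseudo R^2 for a bagged ensemble.

    Each base learner is a 1-nearest-neighbour predictor fitted on a
    bootstrap sample. This is a stand-in for a regression tree (an
    assumption of this sketch, not what randomForest does).
    """
    rng = random.Random(seed)
    n = len(y)
    oob_sum = [0.0] * n   # running sum of OOB predictions for each case
    oob_cnt = [0] * n     # number of learners for which the case is OOB
    per_tree_mse = []     # OOB MSE of each individual learner

    for _ in range(n_trees):
        # Bootstrap sample: n draws with replacement.
        in_bag = sorted({rng.randrange(n) for _ in range(n)})
        in_bag_set = set(in_bag)
        sq_errs = []
        for i in range(n):
            if i in in_bag_set:
                continue  # case i is in-bag for this learner
            # Predict case i from its nearest in-bag neighbour.
            j = min(in_bag, key=lambda k: abs(x[k] - x[i]))
            pred = y[j]
            oob_sum[i] += pred
            oob_cnt[i] += 1
            sq_errs.append((y[i] - pred) ** 2)
        if sq_errs:
            per_tree_mse.append(sum(sq_errs) / len(sq_errs))

    # Aggregate first: the OOB prediction is the mean over the learners
    # for which the case was OOB. Cases that were never OOB are skipped.
    used = [i for i in range(n) if oob_cnt[i] > 0]
    residuals = [y[i] - oob_sum[i] / oob_cnt[i] for i in used]
    mse = sum(r * r for r in residuals) / len(used)

    # Pseudo R^2 = 1 - mse / Var(y), with an n divisor (assumption).
    y_used = [y[i] for i in used]
    y_bar = sum(y_used) / len(y_used)
    var_y = sum((v - y_bar) ** 2 for v in y_used) / len(y_used)
    rsq = 1 - mse / var_y if var_y > 0 else float("nan")

    return {
        "mse": mse,
        "rsq": rsq,
        # Shown only for contrast; this is NOT the quantity described above.
        "mean_tree_mse": sum(per_tree_mse) / len(per_tree_mse),
    }


# Toy data (made up): a noisy sine curve.
data_rng = random.Random(1)
x = [i / 10 for i in range(100)]
y = [math.sin(v) + data_rng.gauss(0, 0.3) for v in x]

result = oob_summary(x, y)
print("MSE from aggregated OOB predictions:", result["mse"])
print("pseudo R^2:", result["rsq"])
print("average per-learner OOB MSE:", result["mean_tree_mse"])
```

The two MSE figures generally differ, because averaging the predictions before squaring removes part of the variance of the individual learners.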
I hope there's no question about how the pseudo R^2 is computed on a test set? If you understand how that's done, I assume the confusion is only about how the OOB MSE is formed.

Best,
Andy

From: Dimitri Liakhovitski
>
> Dear Random Forests gurus,
>
> I have a question about R^2 provided by randomForest (for regression).
> I don't succeed in finding this information.
>
> In the help file for randomForest under "Value" it says:
>
> rsq: (regression only) - "pseudo R-squared'': 1 - mse / Var(y).
>
> Could someone please explain in somewhat more detail how exactly R^2
> is calculated?
> Is "mse" mean squared error for prediction?
> Is "mse" an average of mse's for all trees run on out-of-bag
> holdout samples?
> In other words - is this R^2 based on out-of-bag samples?
>
> Thank you very much for clarification!
>
> --
> Dimitri Liakhovitski
> MarketTools, Inc.
> dimitri.liakhovit...@markettools.com
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.