Re: [R] Question about randomForest

2012-04-04 Thread Liaw, Andy
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Saruman I don't see how this answered the original question of the poster. He was quite clear: the values of the predictions coming out of RF do not match what comes out of the predict function using

Re: [R] Question about randomForest

2012-04-03 Thread Saruman
I don't see how this answered the original question of the poster. He was quite clear: the values of the predictions coming out of RF do not match what comes out of the predict function using the same RF object and the same data. Therefore, what is predict() doing that is different from RF? Yes, RF

Re: [R] Question about randomForest

2011-11-28 Thread Liaw, Andy
-project.org Subject: Re: [R] Question about randomForest Matthew, your interpretation of calculating error rates based on the training data is incorrect. In Andy Liaw's help file: err.rate: (classification only) vector error rates of the prediction on the input data, the i-th element being

Re: [R] Question about randomForest

2011-11-27 Thread Matthew Francis
Thanks for the help. Let me explain in more detail how I think that randomForest works so that you (or others) can more easily see the error of my ways. The function first takes a random sample of the data, of the size specified by the sampsize argument. With this it fully grows a tree resulting

Re: [R] Question about randomForest

2011-11-27 Thread Ken
I am pretty sure that when each tree is fitted, the error rate for tree 'i' is its performance on the data which was not used to fit the i-th tree (OOB). In this way cross-validation is performed for each tree, but I do not think that all trees fitted prior are involved in the computation of that

Re: [R] Question about randomForest

2011-11-27 Thread Weidong Gu
Matthew, your interpretation of calculating error rates based on the training data is incorrect. In Andy Liaw's help file: err.rate: (classification only) vector error rates of the prediction on the input data, the i-th element being the (OOB) error rate for all trees up to the i-th. My
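To make the help-file wording concrete, here is a minimal Python sketch of a cumulative OOB error rate, next to the error obtained when every learner votes on the training data. It is a toy under stated assumptions, not the randomForest implementation: a 1-nearest-neighbour lookup stands in for a fully grown tree, the data are synthetic, and all names (nearest_label, majority, oob_err_by_tree, resub_err) are hypothetical.

```python
import random

def nearest_label(x, sample):
    """Stand-in for a fully grown tree (assumption): label of the closest in-bag point."""
    return min(sample, key=lambda p: abs(p[0] - x))[1]

def majority(votes):
    """Majority class among 0/1 votes (ties go to class 1)."""
    return int(2 * sum(votes) >= len(votes))

rng = random.Random(1)
n, n_trees = 200, 50

# Synthetic data: the true class is x > 0.5, with 20% of the labels flipped as noise.
data = []
for _ in range(n):
    x = rng.random()
    y = int(x > 0.5)
    if rng.random() < 0.2:
        y = 1 - y
    data.append((x, y))

oob_votes = [[] for _ in range(n)]  # votes only from learners that did not see point i
all_votes = [[] for _ in range(n)]  # votes from every learner
oob_err_by_tree = []                # i-th entry: OOB error using learners 1..i

for _ in range(n_trees):
    idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample, drawn with replacement
    in_bag = set(idx)
    sample = [data[i] for i in idx]
    for i, (x, y) in enumerate(data):
        vote = nearest_label(x, sample)
        all_votes[i].append(vote)
        if i not in in_bag:
            oob_votes[i].append(vote)
    # Score only points that have been out of bag at least once so far.
    scored = [(majority(v), data[i][1]) for i, v in enumerate(oob_votes) if v]
    oob_err_by_tree.append(sum(p != y for p, y in scored) / len(scored))

# Error when all learners vote on the same data they were trained on.
resub_err = sum(majority(v) != data[i][1] for i, v in enumerate(all_votes)) / n

print("OOB error after the last learner:", oob_err_by_tree[-1])
print("error with all learners voting  :", resub_err)
```

In this toy, each point is in-bag for most learners, and those learners simply return its own label, so the all-learner vote reproduces the training labels far more closely than the OOB vote does. That is one plausible reading of why OOB-based predictions and predictions made on the training data can disagree.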

[R] Question about randomForest

2011-11-26 Thread Matthew Francis
I've been using the R package randomForest but there is an aspect I cannot work out the meaning of. After calling the randomForest function, the returned object contains an element called predicted, which is the prediction obtained using all the trees (at least that's my understanding). I've

Re: [R] Question about randomForest

2011-11-26 Thread Weidong Gu
Hi Matthew, The error rate reported by randomForest is the prediction error based on out-of-bag (OOB) data. Therefore, it is different from prediction error on the original data since each tree was built using bootstrap samples (about 63% of the distinct original observations), and the error rate of OOB is likely
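As a rough check on how much of the data a single bootstrap sample covers, here is a small Python simulation. It is only a sketch: it assumes sampling n observations with replacement from n, and the helper name in_bag_fraction is hypothetical. For large n the expected share of distinct observations is 1 - 1/e, with the remainder left out of bag.

```python
import math
import random

def in_bag_fraction(n, rng):
    """Fraction of the n observations that appear at least once in one bootstrap sample."""
    drawn = {rng.randrange(n) for _ in range(n)}
    return len(drawn) / n

rng = random.Random(0)
n, trials = 1000, 200

# Average the in-bag fraction over many independent bootstrap samples.
mean_in_bag = sum(in_bag_fraction(n, rng) for _ in range(trials)) / trials

print(f"simulated mean in-bag fraction: {mean_in_bag:.3f}")
print(f"theoretical limit 1 - 1/e     : {1 - math.exp(-1):.3f}")
```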