Hi,
 
We have a question about interpreting R-squared.
 
The actual outputs for a test dataset are X=(x1,x2, ..., xn).
model 1 predicted the outputs as Y1=(y11,y12,..., y1n)
model 2 predicted the outputs as Y2=(y21,y22,..., y2n)
 
... 
model m predicted the outputs as Ym=(ym1,ym2,..., ymn)
 
Now we have two ways to calculate R squared to evaluate the average performance of 
the committee model.
 
(a) Calculate R squared between (X, Y1), (X, Y2), ..., (X,Ym), and then average the 
R squared values.
(b) Calculate the average Y=(Y1+Y2+ ... +Ym)/m, and then calculate the R squared between 
(X, Y). 
 
We found that the R squared calculated in (b) seems to be 'always' higher than that in 
(a).
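
To make the two calculations concrete, here is a minimal sketch in Python. It
assumes that R squared means 1 - SSE/SST (the coefficient of determination); a
squared-correlation definition would give different numbers. The function names
and the synthetic data are purely illustrative and are not our real models or
test set.

```python
import random

def r_squared(actual, predicted):
    """R squared as 1 - SSE/SST (assumed definition, see the note above)."""
    mean_actual = sum(actual) / len(actual)
    sse = sum((x - y) ** 2 for x, y in zip(actual, predicted))
    sst = sum((x - mean_actual) ** 2 for x in actual)
    return 1.0 - sse / sst

def committee_r_squared(actual, predictions):
    """Return (r2_a, r2_b) for a list of m prediction vectors of length n."""
    m = len(predictions)
    # (a) R squared of each model against X, then the mean of those values
    r2_a = sum(r_squared(actual, y) for y in predictions) / m
    # (b) average the m predictions element-wise, then a single R squared
    averaged = [sum(column) / m for column in zip(*predictions)]
    r2_b = r_squared(actual, averaged)
    return r2_a, r2_b

# Synthetic illustration only: 5 hypothetical models, each X plus noise
random.seed(0)
n, m = 50, 5
X = [random.gauss(0, 1) for _ in range(n)]
models = [[x + random.gauss(0, 0.5) for x in X] for _ in range(m)]

r2_a, r2_b = committee_r_squared(X, models)
print(f"(a) mean of per-model R squared:      {r2_a:.3f}")
print(f"(b) R squared of averaged prediction: {r2_b:.3f}")
```

Changing the seed, the noise level, or the number of models is a quick way to
see how the gap between (a) and (b) behaves on other synthetic data.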
 
Does this result depend on the test dataset, or did it happen by chance? Can you point 
me to any reference on this issue? 

Many thanks in advance!

Kan

 


                

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
