On Wed, 26 Nov 2003 13:24:39 -0500, Rajarshi Guha
<[EMAIL PROTECTED]> wrote:

> Hello,
>   I was wondering whether anybody would be able to help with this query.
> 
> 
> I have some neural network models which make predictions for a dataset. When
> comparing various models we evaluate their effectiveness by looking at the RMS
> error and the value of R^2 between the predicted and actual values.
> 
> However, I seem to have read somewhere that R^2 is not always a 'good
> indicator' - in that a data set can be randomly generated yet show a good
> R^2. Is this true? And if so, does anybody know how I can reference this
> (paper/book)?

In a simple OLS regression, where you have not done
any preselection of variables, the expected value of R^2 by
chance alone is the number of (random) predictor variables, k,
divided by (N - 1), where N is the number of cases.
This is what the correction is for, when you read about the
"adjusted R-squared" that a regression program gives you.
So, a random set of 10 variables predicting 21 cases gives
you an R-squared (by chance alone) of 10/20 = 0.50.
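
[Not part of the original reply: here is a minimal simulation sketch, in
Python, that illustrates the point above. The counts (10 random predictors,
21 cases, 2000 trials) just mirror the example; the variable names are
illustrative choices, not anything from the post.]

import numpy as np

rng = np.random.default_rng(0)
k, N, n_trials = 10, 21, 2000   # 10 random predictors, 21 cases

r2_values = []
for _ in range(n_trials):
    X = rng.standard_normal((N, k))
    y = rng.standard_normal(N)             # response is pure noise
    X1 = np.column_stack([np.ones(N), X])  # add intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2_values.append(1.0 - ss_res / ss_tot)

mean_r2 = np.mean(r2_values)
# The "adjusted R-squared" correction: 1 - (1 - R^2)(N - 1)/(N - k - 1)
adj_r2 = 1.0 - (1.0 - mean_r2) * (N - 1) / (N - k - 1)
print(f"mean R^2 over {n_trials} random datasets: {mean_r2:.3f}")  # about 0.50
print(f"adjusted R^2 at that mean:               {adj_r2:.3f}")    # about 0.0

The average chance R^2 comes out near k/(N - 1) = 0.50, and the adjusted
R-squared correction pulls it back toward zero, which is the point of the
adjustment.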

Now, if you screened out some variables beforehand,
then there are papers arguing that you should consider the
*starting* number of variables as the source of the bias.

See any book on regression; this is one of those facts that
should not need much specific citation or defense in your
own use of it.
Does this address what you had in mind?



-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
"Taxes are the price we pay for civilization." 