[tips] Re: Question from a student

Rick Froman Wed, 06 Sep 2006 09:35:25 -0700

Yes, I did mention to him that statistical studies have shown N-1 to give the 
best estimate but, since the class meets in a computer lab, I think a great way 
to demonstrate it would be to use Excel to generate a bunch of random samples 
(with a known population SD) and then calculate a number of SDs using N, N-1, 
and other denominators to see which gives the best (least biased) estimate. 
Thank you.
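
The lab exercise described above could also be sketched outside Excel. What 
follows is a minimal Python simulation, not anything from the original thread: 
the function name, the normal population with SD 10, the sample size of 5, and 
the 20,000 trials are all assumptions chosen for illustration. It compares 
variances rather than SDs, because it is the variance computed with N-1 that is 
unbiased; the square root of that variance still slightly underestimates the 
population SD.

```python
import random


def average_variance_estimates(pop_sd=10.0, n=5, trials=20000, seed=0):
    """Average the variance estimate over many random samples.

    Returns a dict mapping k to the long-run average of SS / (n - k),
    where SS is the sum of squared deviations from the sample mean.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    offsets = (0, 1, 2)         # denominators N, N-1, N-2
    totals = {k: 0.0 for k in offsets}
    for _ in range(trials):
        # Draw one sample from a normal population with a known SD.
        sample = [rng.gauss(0.0, pop_sd) for _ in range(n)]
        mean = sum(sample) / n
        ss = sum((x - mean) ** 2 for x in sample)
        for k in offsets:
            totals[k] += ss / (n - k)
    return {k: totals[k] / trials for k in offsets}


if __name__ == "__main__":
    pop_variance = 10.0 ** 2
    for k, avg in average_variance_estimates().items():
        print(f"denominator N-{k}: average estimate {avg:.1f} "
              f"(population variance {pop_variance:.1f})")
```

With these settings the average for N should land near 80, for N-1 near 100, 
and for N-2 near 133, against a population variance of 100.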
 
Rick
 
 
Dr. Rick Froman
Psychology Department
Box 3055
John Brown University
Siloam Springs, AR 72761
(479) 524-7295
[EMAIL PROTECTED]
"Pete, it's a fool that looks for logic in the chambers of the human heart"
- Ulysses Everett McGill

________________________________

From: Claudia Stanny [mailto:[EMAIL PROTECTED]
Sent: Wed 9/6/2006 11:02 AM
To: Teaching in the Psychological Sciences (TIPS)
Subject: [tips] Re: Question from a student

This is the intuitive explanation I give for degrees of freedom - how many 
numbers are really free to vary randomly when they must sum to a particular 
value.

The other part of Rick's question refers to the bias of a sample statistic. If 
you compute the standard deviation (or variance) based on N, the sample 
statistic systematically underestimates the population parameter (this is why 
it is called a biased statistic). There is an algebraic proof that computing 
variance using N-1 produces a sample statistic that is an unbiased estimate of 
the population variance (the long run average of sample variances computed 
using N-1 will be equal to the value of the population variance whereas the 
long run average of sample variances computed using N will always be smaller 
than the population variance).
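
The algebraic argument mentioned above can be sketched in a few lines. This is 
one standard version, written under the assumption that the N observations are 
independent draws from a population with mean \mu and variance \sigma^2; the 
symbols X_i, \bar{X}, and SS are notation introduced here, not taken from the 
thread.

```latex
% Split the deviations from the sample mean around the population mean:
SS = \sum_{i=1}^{N} (X_i - \bar{X})^2
   = \sum_{i=1}^{N} (X_i - \mu)^2 - N(\bar{X} - \mu)^2

% Take expectations, using E[(X_i - \mu)^2] = \sigma^2
% and E[(\bar{X} - \mu)^2] = \sigma^2 / N:
E[SS] = N\sigma^2 - N \cdot \frac{\sigma^2}{N} = (N - 1)\sigma^2

% Hence dividing by N-1 is unbiased, while dividing by N falls short:
E\!\left[\frac{SS}{N-1}\right] = \sigma^2,
\qquad
E\!\left[\frac{SS}{N}\right] = \frac{N-1}{N}\,\sigma^2 < \sigma^2
```

The same line shows why N-2 or N-3 would not work: dividing by N-2 gives a 
long-run average of (N-1)/(N-2) times the population variance, which is too 
large.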

Claudia Stanny

________________________________

From: Steven Specht [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, September 06, 2006 10:38 AM
To: Teaching in the Psychological Sciences (TIPS)
Cc: Laurence Roberts; Arlene Lunquist; Della Ferguson; Elise Pepin Pepin
Subject: [tips] Re: Question from a student

Rick, 

Briefly (and I can elaborate if needed), the N-1 in the "formula" is also 
referred to as the degrees of freedom. It comes from the fact that, given any 
set of numbers, if you assume (and it is a "strong" assumption) that the best 
single-value guess for an unknown population mean is the mean of that sample, 
then what remains to be inferred is the variability of the data set. Two data 
sets can, of course, have the same mean but different variabilities. Let's say 
you have a data set with 5 numbers in it and calculate a particular mean. This 
would be the inferred mean of the unknown population from which the sample was 
taken. Now you need to make a guess at the variability. If you start "making 
up" numbers which might comprise a sample of five (BUT HAVE AS YOUR LIMITATION 
THE ORIGINAL INFERRED MEAN), you can make up any four numbers (they are free to 
vary)... after which the fifth number is dictated (given the values of the 
other four and retaining the inferred mean). If it were a theoretical group of 
23 numbers, 22 would be "free to vary" and the last would be dictated by the 
other 22. Therefore N-1 = degrees of freedom. 

I know this might not be as clear as I could make the explanation given more 
time (maybe I should work up a good one). Try this exercise in class. Make up a 
sample mean from, let's say, a sample of 7 unknown scores. Ask one student to 
provide a potential single score: "Can this be one of the scores and still have 
a sample mean of whatever it is that you made up?" "Yes"... Keep going 
one by one. You'll find that the answer is "yes" every time, except for the 
last number, which is then mathematically "restricted"/dictated by the previous 
6. Voilà, degrees of freedom (N-1). It's not as arbitrary as it seems to 
students. 
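
The classroom exercise above can be checked with a few lines of Python. This is 
only a sketch: the function name forced_last_score, the target mean of 8, and 
the six "student" scores are made up for illustration. It relies on the fact 
that N scores with a given mean must sum to N times that mean.

```python
def forced_last_score(free_scores, target_mean, n):
    """Return the single score that makes n scores average to target_mean.

    free_scores holds the n - 1 values that were chosen freely.
    """
    if len(free_scores) != n - 1:
        raise ValueError("exactly n - 1 scores may be chosen freely")
    # The n scores must sum to n * target_mean, so the last one is fixed.
    return n * target_mean - sum(free_scores)


# Six scores offered freely by students (hypothetical values).
free = [3, 12, 7, 9, 1, 15]
last = forced_last_score(free, target_mean=8, n=7)
print(last)                      # 9
print(sum(free + [last]) / 7)    # 8.0
```

Any six values are accepted, but the seventh has no freedom left: change one of 
the six and the forced value changes with it.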

Hope this helps. 

-S 

On Sep 6, 2006, at 11:19 AM, Rick Froman wrote: 

        I hope that subject line isn't copyrighted. 

        After I explained why the formula for the s used to predict σ has N-1 
in the denominator (to inflate it for a more conservative estimate, since it is 
just an estimate of the population standard deviation), a student asked: why 
N-1 and not N-2 or N-3? I mentioned statistical studies about how N-1 gives the 
best estimate of the population standard deviation, but I wonder if anyone has 
a good explanation for why it is N-1. I know if the number subtracted got too 
high, small sample sizes would end up with a negative denominator (which would 
make no sense). 

        Rick 

        Dr. Rick Froman, Chair 
        Division of Humanities and Social Sciences 
        Professor of Psychology 
        John Brown University 
        2000 W. University 
        Siloam Springs, AR  72761 
        [EMAIL PROTECTED] 
        (479) 524-7295 
        http://www.jbu.edu/academics/hss/psych/faculty.asp 

        "Pete, it's a fool that looks for logic in the chambers of the human 
        heart." 
        - Ulysses Everett McGill 

        --- 

        To make changes to your subscription go to: 

http://acsun.frostburg.edu/cgi-bin/lyris.pl?enter=tips&text_mode=0&lang=english 

======================================================== 

Steven M. Specht, Ph.D. 
Associate Professor of Psychology 
Utica College 
Utica, NY 13502 
(315) 792-3171 

"Mice may be called large or small, and so may elephants, and it is quite 
understandable when someone says it was a large mouse that ran up the trunk of 
a small elephant" (S. S. Stevens, 1958) 
