Re: Sample Size Question

Stan Brown Wed, 22 Jan 2003 08:52:13 -0800

Eric Lund <[EMAIL PROTECTED]> wrote in sci.stat.edu:

> We really do not want to score all 1600 if we don't have to. 
[snip]
>So, what I would like to
>know is, can we get an approximation for the population mean without
>having any baseline data, and if so, how many students would need to
>be sampled?


[Please post right-side-up and trim quotes; see 
<http://web.presby.edu/~nnqadmin/nnq/nquote.html>.]

This prompts a further question. Putting the statistics aside for a 
moment, I'll bet that if Indiana mandates that you develop the test 
locally, the state also mandates what sort of data analysis you can 
do and whether you're allowed to score only a sample of tests. So be 
sure to check that before you go any further.

That said, if you don't know the population standard deviation then 
you can't decide a priori how big a sample you need to estimate a 
population mean to a desired level of confidence and with a desired 
margin of error.
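
To see why the standard deviation matters, here is a minimal Python 
sketch of the usual textbook planning formula, n = (z * sigma / E)^2, 
which assumes sigma is already known. The function name required_n 
and the sigma values are hypothetical, picked only to show how much 
the answer swings on a number you don't have yet.

```python
import math
from statistics import NormalDist

def required_n(sigma, margin, confidence=0.95):
    """Sample size for estimating a mean when the population SD is known.

    Textbook planning formula n = (z * sigma / margin)**2, rounded up.
    It ignores any finite-population adjustment.
    """
    # z is the normal critical value for the chosen two-sided confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)

# Hypothetical guesses for sigma, with a desired margin of +/- 2 points
for sigma in (10, 20):
    print(sigma, required_n(sigma, margin=2))
```

Doubling the guessed standard deviation roughly quadruples the 
sample size, so without some estimate of it the formula has nothing 
to work with.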

What you _can_ do is score a reasonable number of them -- say 80 
(which is 1/20 of the population). Compute the mean and standard 
deviation of that sample, then use Student's t to compute a 
confidence interval. This is almost the first inferential procedure 
in any basic statistics textbook (or you can do it easily on a TI-83 
with STAT | TESTS | 8; it's a little more work in Excel).

You will end up with a statement of this form: "With __(a)__% 
confidence, the mean score of all 1600 tests is __(b)__ +/- 
__(c)__." (a) is a number you preselect (.95 or 95% is common, e.g. 
in political polls); (b) is the sample mean you computed; (c) is 
computed in the confidence interval procedure. For instance, if 
sample size is 80 and (a) is 95% then (c) is about .2225 times your 
sample standard deviation. [More formally, (c) is inverse t for one-
tailed area (1-95%)/2 = 0.025 and degrees of freedom = (sample 
size)-1, times your sample standard deviation, divided by the square 
root of your sample size.]
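
If you'd rather check (c) in code than on a calculator, the 
following is a rough Python sketch using only the standard library. 
The helper names (t_pdf, t_cdf, t_critical, margin_of_error) are 
made up for this example, and the inverse t is found by numerical 
integration and bisection, which is an assumption of convenience 
rather than how a TI-83 or a stats package does it.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def t_critical(confidence, df):
    """Inverse t for one-tailed area (1 - confidence) / 2."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:   # widen the bracket until it holds the answer
        hi *= 2
    for _ in range(60):             # bisection
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def margin_of_error(sample_sd, n, confidence=0.95):
    """(c) = t * s / sqrt(n), with n - 1 degrees of freedom."""
    return t_critical(confidence, n - 1) * sample_sd / math.sqrt(n)

# With a sample SD of 1 this is the multiplier for n = 80 at 95%
print(round(margin_of_error(1.0, 80), 4))
```

Multiply the printed figure by your own sample standard deviation to 
get (c); the interval is then the sample mean plus or minus that 
amount.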

Now if your sample gives an unacceptably large margin of error, 
about all you can do is repeat with a larger sample. But if the 
sample is bigger than 5%-10% of the population, the assumptions that 
let you compute a confidence interval as described above begin to 
break down. There are techniques to deal with that problem, but I 
don't understand them well enough to explain them.

But again, I question whether this is legally acceptable. Remember 
all the flap two years ago about using statistical methods on the US 
Census figures? I'm not saying you're doing anything illegal, just 
counseling you to be sure where you stand (unless of course you've 
already investigated this legally).

-- 
Stan Brown, Oak Road Systems, Cortland County, New York, USA
                                  http://OakRoadSystems.com/
"My theory was a perfectly good one. The facts were misleading."
                                   -- /The Lady Vanishes/ (1938)