Mike Franke wrote (in part):

> "A school district had anticipated that the percentages of boys and
> girls who planned no further education would be the same,
> approximately 44% for all of the students.  Two independent random
> samples of the seniors at a high school are taken; the first was a
> sample of 10 boys and the second was a sample of 25 girls.  The boys'
> sample indicated that 50% of them planned no further education after
> graduation, while the girls' sample indicated that only 40% of them
> planned no further education after graduation.  Which of the following
> is valid for this information?"
> 
> The solution explanation (in the back of the book, p349) states that
> "the small sample size indicates that we cannot assume normality, but
> the formula for the standard deviation is true regardless of the shape
> of the distribution."

(solutions:)
> 
>    A) The sampling distribution is approximately normal with mean 0
> and approximate standard deviation .1857.
>    B) The sampling distribution is approximately normal with mean .1
> and approximate standard deviation .0345
>    C) No conclusion can be drawn regarding the sampling distribution
> since the samples are taken from the same population.
>    D) No conclusion can be drawn regarding the sampling distribution
> since the sample size of the boys' sample is too small
>    E) None of these statements is valid.

        Where to start? Sample size is not a problem, even for the boys' sample
analyzed on its own - the usual rule is np, nq >= 5, and with the observed
50% both come to 10 x 0.5 = 5.  This means that while the (binomial)
sampling distribution is not normal, it is "close enough for rock'n'roll".
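
A rough way to see how close "close enough" is: the sketch below compares the exact binomial CDF for the boys' sample with its normal approximation. It assumes n = 10 and the observed proportion 0.5; the continuity correction (the + 0.5) and the helper names are my own choices, not anything from the book or the question.

```python
from math import comb, erf, sqrt


def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_cdf(x, mean, sd):
    """P(Z <= x) for a normal distribution with the given mean and sd."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))


# Assumed values: boys' sample size and observed proportion.
n, p = 10, 0.5
mean = n * p
sd = sqrt(n * p * (1 - p))

# Absolute gap between exact and approximate CDF at every possible count,
# using a continuity correction of +0.5 on the normal side.
errors = [
    abs(binom_cdf(k, n, p) - normal_cdf(k + 0.5, mean, sd))
    for k in range(n + 1)
]
max_error = max(errors)

print(f"np = {n * p}, nq = {n * (1 - p)}")
print(f"largest CDF discrepancy: {max_error:.4f}")
```

Changing p lets you see how the agreement degrades as np or nq drops below 5.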


        (1) The question is not well phrased. Is the null hypothesis meant to
be that the proportions are the same for boys and girls? Three other
interpretations are possible. Two of them are the bivariate null
hypothesis that the proportions for both boys and girls are equal to 44%,
and the conditional null hypothesis that they are equal, given that the
overall proportion in the school is 44%. (This last one cannot be analyzed
properly without knowing the sex ratio in the school.)

        A final awful possibility - which almost fits the mean suggested in A,
though not the standard deviation - is that this is a stratified sample
being misanalyzed as a simple random sample, for the null hypothesis
that the overall mean is 44%.

        (2) Assuming the first interpretation, the sampling distribution would
be for the difference of sample proportions, and its estimated mean
*would* be 0.1 (the observed difference, 0.5 - 0.4).

        The standard deviation to use depends on whether you are doing a
hypothesis test - in which case you ASSUME the null, if it suits you; here
it does, because it lets you pool the data in calculating s, "for the sake
of argument". This gives an SD of about 0.185 (pooled proportion 15/35).
The one you get using the confidence-interval calculation, which does NOT
assume poolability, is slightly larger, about 0.186. The 0.1857 in (A) is
what you get by plugging in 44% for both groups.
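
For anyone who does want the details worked out, here is a minimal sketch of the three candidate standard deviations for the difference of sample proportions. It assumes the counts implied by the question (5 of 10 boys, 10 of 25 girls); the variable names are mine. Under those assumptions the pooled SD comes to about 0.185, the unpooled one to about 0.186, and plugging in 44% for both groups gives 0.1857.

```python
from math import sqrt

# Assumed inputs, taken from the question as quoted.
n_boys, n_girls = 10, 25
p_boys, p_girls = 0.50, 0.40

# Pooled SD (hypothesis test: assume both groups share one proportion).
p_pooled = (p_boys * n_boys + p_girls * n_girls) / (n_boys + n_girls)
sd_pooled = sqrt(p_pooled * (1 - p_pooled) * (1 / n_boys + 1 / n_girls))

# Unpooled SD (confidence interval: each group keeps its own proportion).
sd_unpooled = sqrt(
    p_boys * (1 - p_boys) / n_boys + p_girls * (1 - p_girls) / n_girls
)

# SD if the district's anticipated 44% is used for both groups.
p_null = 0.44
sd_null = sqrt(p_null * (1 - p_null) * (1 / n_boys + 1 / n_girls))

print(f"pooled:      {sd_pooled:.4f}")
print(f"unpooled:    {sd_unpooled:.4f}")
print(f"p = 0.44:    {sd_null:.4f}")
```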

        The SD is large enough that there isn't a cat's chance of rejecting
the null, BTW (which would need a difference of around 2s), and a CI will
be absurdly wide. The sample sizes ARE far too small for a useful study,
and could have been predicted to be too small ahead of time,  making the
example silly. In particular, the uneven division is hideously
inefficient.  Students should _never_ be exposed to this sort of poor
practice without very careful explanation of why it's bad.
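
To put rough numbers on this, the sketch below computes the test statistic, an approximate 95% confidence interval, and the variance penalty for splitting 35 students as 10 and 25 rather than evenly. The 1.96 critical value, the 17/18 "even" split used for comparison, and all names are my assumptions for illustration.

```python
from math import sqrt

# Assumed inputs, taken from the question as quoted.
n_boys, n_girls = 10, 25
p_boys, p_girls = 0.50, 0.40
diff = p_boys - p_girls

# Test statistic using the pooled standard error.
p_pooled = (p_boys * n_boys + p_girls * n_girls) / (n_boys + n_girls)
se_test = sqrt(p_pooled * (1 - p_pooled) * (1 / n_boys + 1 / n_girls))
z = diff / se_test

# Approximate 95% confidence interval using the unpooled standard error.
se_ci = sqrt(p_boys * (1 - p_boys) / n_boys + p_girls * (1 - p_girls) / n_girls)
z_crit = 1.96  # assumed two-sided 95% normal critical value
ci_low = diff - z_crit * se_ci
ci_high = diff + z_crit * se_ci

# Variance of the difference scales with (1/n1 + 1/n2); compare the actual
# 10/25 split with a near-even 17/18 split of the same 35 students.
variance_inflation = (1 / n_boys + 1 / n_girls) / (1 / 17 + 1 / 18)

print(f"z = {z:.2f}")
print(f"95% CI for the difference: ({ci_low:.3f}, {ci_high:.3f})")
print(f"variance inflation from the uneven split: {variance_inflation:.2f}")
```

The z value falls well short of 2, and the interval comfortably includes zero.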

        So, as far as I can see, (A) has the wrong mean OR the wrong sampling
technique; (B) has the wrong standard deviation (you would need sample
sizes in the hundreds to get this small a SD). (C) is simply wrong.  (D)
is marginal but most statisticians would accept the distributional
assumptions for this purpose. I'd go with (E), and suggest that the
question should not ever be used in the classroom, for multiple reasons.
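
The "sample sizes in the hundreds" remark can be checked with a quick sketch. It assumes equal group sizes and a common proportion of 0.44, neither of which is stated in the question; the names are mine. It also computes the variance for the actual sample sizes, which comes to about 0.0345. That may be where the figure in (B) came from, but this is a guess on my part, not something the book states.

```python
from math import ceil

# Assumptions: equal group sizes, common proportion 0.44.
target_sd = 0.0345
p = 0.44

# SD of the difference with n per group is sqrt(2 * p * (1 - p) / n);
# solve for the smallest n that brings the SD down to the target.
n_per_group = ceil(2 * p * (1 - p) / target_sd**2)

# Variance (not SD) of the difference for the actual samples of 10 and 25.
variance_actual = p * (1 - p) * (1 / 10 + 1 / 25)

print(f"n per group needed for SD {target_sd}: {n_per_group}")
print(f"variance with n = 10 and 25: {variance_actual:.4f}")
```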

        -Robert Dawson