A few comments, sprinkled within your post:

On 2 Dec 2002, David Robinson wrote in part:

> Each group for which I have all the data consists of a normal
> distribution.

Strictly speaking, this is not true, although the distribution may
indeed be *approximately* normal.  (Any normal distribution is
continuous and unbounded;  but any real (that is, observed) distribution
is bounded above and below -- in a k-item test, one cannot get fewer
than 0 nor more than k items right -- and is discrete in so far as data
are reported to a finite number of digits.)  It *may* be reasonable to
assume that the *latent* distribution underlying your data is
approximately normal, to within the precision of the data.

> I was assuming that, [in combining data sets] I could get another
> normal distribution.  I required the psychometric function of this
> combined data, [...] assuming normal distribution.
>
> However, even if all the individual groups exhibit a normal
> distribution (and I don't have data to prove this!), the combination
> certainly doesn't have to - and may not even be close if I choose to
> combine two groups with very different means (I believe this is one
> issue hinted at in the final post).

If you are dealing with a measure suitable for students in elementary
and/or secondary schools, and if the measure is expected to change
(increase?) systematically with age, you should not expect a normal
distribution over a range of ages.  If the true relationship is linear
with age (over some useful range of ages), the distribution of scores
would be expected to be approximately uniform (or rectangular) in the
middle of the range of scores, with tails at top and bottom that might
resemble normal tails (cut off from their parent distribution).
 The shape I have described would follow from several assumptions that
are (or at any rate used to be) commonly made:
 (a) the conditional distribution of test scores at a given age is
normal (aka Gaussian);
 (b) the (conditional) mean score increases linearly with age;
 (c) the population has been stable for some years, implying that the
subpopulations at each age are of (approximately) equal size.

Put another way:  normal conditional distributions, a rectangular
population distribution by age, and a mean score that is linear in age
together produce the shape described above.
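
For anyone who wants to see that shape, here is a minimal sketch in Python. It takes assumptions (a) to (c) literally, so the marginal density of scores is a normal density averaged over a uniform spread of means. The function name and the numbers (means running from 20 to 80, an SD of 5 at each age) are invented for illustration and are not taken from anyone's data.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and SD 1


def score_density(x, lowest_mean, highest_mean, sd):
    """Marginal density of scores under assumptions (a)-(c).

    The conditional mean is spread uniformly between lowest_mean and
    highest_mean (linear in age, equal numbers at each age), and scores at
    any one age are normal with standard deviation sd.
    """
    upper = STD_NORMAL.cdf((x - lowest_mean) / sd)
    lower = STD_NORMAL.cdf((x - highest_mean) / sd)
    return (upper - lower) / (highest_mean - lowest_mean)


# Hypothetical numbers: the mean score rises from 20 to 80 across the age
# range, with an SD of 5 at each age.
for score in (5, 10, 15, 20, 30, 40, 50, 60, 70, 80, 85, 90, 95):
    print(f"{score:3d}  {score_density(score, 20, 80, 5):.5f}")
```

The printed densities should be essentially constant through the middle of the score range and fall away at each end in the manner of a normal tail.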

> I have chosen to calculate the psychometric functions on a group by
> group basis. Thus, I have the %age of people in each group who would
> respond to a sound of a given intensity. Then, it is trivial to add
> the percentages across groups (at each intensity) weighting by the
> number of subjects in each group, or some other factor. I hope this
> isn't a terrible thing to do?

Not so far;  but THEN what are you going to do?  Your sequel suggests
that what you really want is the cumulative frequency distribution for
all groups (ages?) combined, and that you intend to calculate
percentiles (NOT "%ages", although that's what you write) from the
overall mean and SD, *assuming normality*;  rather than from the
empirical distribution that you get when you combine several putatively
normal distributions each from different ages (and therefore having
different means).
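
To make the distinction concrete, here is a rough sketch in Python comparing the two calculations. It assumes each group's psychometric function is a normal ogive, and the two groups (sizes, mean thresholds, SDs) are made up for illustration; the function names are likewise hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical groups: (number of subjects, mean threshold, SD of threshold).
GROUPS = [(50, 20.0, 5.0), (50, 60.0, 5.0)]


def pooled_proportion(x, groups):
    """Weighted proportion responding at intensity x, combining each group's
    own normal psychometric function with weights proportional to group size."""
    total = sum(n for n, _, _ in groups)
    return sum(n * NormalDist(mu, sd).cdf(x) for n, mu, sd in groups) / total


def single_normal_proportion(x, groups):
    """Proportion at intensity x if the combined data are treated as one
    normal distribution with the overall mean and SD."""
    total = sum(n for n, _, _ in groups)
    mean = sum(n * mu for n, mu, _ in groups) / total
    # Variance of a mixture: within-group variance plus spread of group means.
    var = sum(n * (sd ** 2 + (mu - mean) ** 2) for n, mu, sd in groups) / total
    return NormalDist(mean, sqrt(var)).cdf(x)


# Columns: intensity, weighted combination, single-normal assumption.
for intensity in (20, 30, 40, 50, 60):
    print(intensity,
          round(pooled_proportion(intensity, GROUPS), 3),
          round(single_normal_proportion(intensity, GROUPS), 3))
```

With group means this far apart the two columns agree at the overall mean but differ noticeably elsewhere, which is the reason for preferring the empirical (weighted) distribution to one built from the overall mean and SD.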

I may add that it is not at all clear to me what utility may reside in
knowing the percentiles for the combined data.  If we're dealing with
school children, nearly EVERYTHING that one does with children is
conditional on age.  That is treated (by nearly everyone) as the most
important single datum about any child:  if a child's letter to an
editor is published, it is ALWAYS accompanied by the child's age, for
example.  If I am a parent, I don't much care that 80% of kids in school
grades 1 to 12 get scores at or below X:  I want to know what scores are
reasonable for my 10-year-old son and my 12-year-old daughter, and as a
person with some background in measurement & statistics I do not want
meaningless information (that X is the 80th percentile in a distribution
that purports to represent all kids from ages 6 to 18) to be presented
as though it meant something useful.

> I want to combine the data from different groups because:
> a) I have data from experiments that should be the same, but aren't!
> b) I have data from different age ranges, which should be different
> (and are!), but I wish to calculate %age values for the whole
> population (with a given demographic, which I can create using correct
> weightings of each age-band data-set).

If you do combine the data, are you not "sweeping under the rug" the
facts you have just mentioned (that some differences one would expect
were *not* observed, and that other differences one would expect *were*
observed)?

> I sincerely apologise for asking the wrong question. If my chosen
> method is totally indefensible, I hope you will correct me.

Well, as remarked above, much depends on just *what* this exercise is
*for*.  So far it looks to me much like an academic exercise (in more
than one sense of the adjective!).  But maybe I'm just being
curmudgeonly today.
                  -- DFB.
 -----------------------------------------------------------------------
 Donald F. Burrill                                            [EMAIL PROTECTED]
 56 Sebbins Pond Drive, Bedford, NH 03110                 (603) 626-0816
 [was:  184 Nashua Road, Bedford, NH 03110               (603) 471-7128]
