On Tue, 7 May 2002 11:08:26 +0200, "Johannes Hartig" <[EMAIL PROTECTED]> wrote:
> Hi all,
> I have the following problem, for which I don't find a really
> satisfying statistical method:
> The item discrimination index for questionnaire items (i.e.,
> the corrected item-total correlation) is supposed to increase
> with the number of similar items answered (when controlling for
> the item content). In several papers these "item context effects"
> are tested by correlating (Fisher-z-transformed) item discrimination
> with item position in the questionnaire (e.g. Knowles, 1988*).
> What leaves me somehow unsatisfied about this technique is that
> the "sample size" used for this analysis equals the number of
> questionnaire items, not the number of respondents.

The subject of the analysis is a certain set of items. If you test each
item in only one position, then you have each item just once. That is an
awfully weak design, which I would be tempted to call an AWFUL design. If
the item positions were not assigned randomly, *then* I would give in to
temptation: Awful.

> What I'm looking for is a method to test the _trend_ in
> correlations of a series of variables x(1), x(2), ... x(k) with
> another variable y. I know how to test the null hypothesis that
> all k correlations are equal, but this is not exactly the question
> I'm trying to answer. Is there any way to test the trend in a series
> of correlations based on the raw data, i.e., one that uses the power
> of the original sample size? Or is the correlation between k and
> r(x(k),y) an adequate procedure, even if this means a "sample"
> size of, e.g., 36 items that were originally answered by 1200
> respondents?

Well, as I mentioned, the correlation between r and k is not adequate,
either, if the positions were not assigned by chance. Yes, you can test
the correlations for homogeneity, but that is an odd test, in a way --
for many purposes, including what you are tackling today, the reasonable
assumption is that those correlations *are* different, with no argument
needed.

But I am a little bit ambivalent. I don't think I have explained - to
myself - what makes one test based on N=1200 "okay" and another one
"not-okay". There is something here that is like using the
'pooled-within' variance, across all the polynomial effects, to analyse
the linear trend in a repeated measures design: I think that it is
mostly wrong, but not 100% of the time.

I know it is better to have
 - a larger N, and
 - a larger difference in correlations, and
 - a large number of correlations.

Here is a counter-example, and one reason that you do need to look at
the individual correlations, which deserve to have a large enough N to
have some reliability: If you had just 5 correlations, ranging merely
from .51 to .55, with N=25, I would believe that *every* outcome -- any
comparison based on those r's -- was chance. However, if this precise
example had such a good correlation with k that the p-level was .0001 or
smaller, I would believe that someone was lying about the design, or
somehow cheating, or hitting the RARE chance: there was not enough
'information' with N=25 and r's from .51 to .55 to demonstrate any
effect, except by accident.

So: you need a large enough reliability in the measures to justify the
accuracy claimed - or assumed - by the test that is being performed.
When results come out *too good*, you do report them; but it is
appropriate to warn that the effect was, somehow, in your opinion,
outside of chance. [I know that this reminds me of 'over-interpreting'
in some other contexts, and I would be pleased if someone would cite
parallels.]
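For concreteness, here is a minimal sketch of the analysis in question --
corrected item-total correlations, Fisher-z transformed, then correlated
with item position. It is only an illustration in Python (numpy/scipy);
the 1200 x 36 response matrix and the variable names are made up for the
example, not anything taken from Knowles (1988).

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Fake Likert-type data: 1200 respondents x 36 items (shapes illustrative only)
responses = rng.integers(1, 6, size=(1200, 36)).astype(float)

n_items = responses.shape[1]
disc = np.empty(n_items)
for j in range(n_items):
    rest = responses.sum(axis=1) - responses[:, j]      # total score minus the item itself
    disc[j] = stats.pearsonr(responses[:, j], rest)[0]  # corrected item-total correlation

z = np.arctanh(disc)                  # Fisher z: z = 0.5 * ln((1 + r) / (1 - r))
position = np.arange(1, n_items + 1)  # item position in the questionnaire

r_trend, p_trend = stats.pearsonr(position, z)
print(f"r(position, z) = {r_trend:.3f}, p = {p_trend:.3f}  (n = {n_items} items)")

Note that the trend test at the end sees only n = 36 "cases", however
many respondents produced each r -- which is exactly the complaint above;
and if the positions were not randomized, even that n = 36 test is
suspect.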
--
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
