Re: paired t-test for test-retest reliability reference?

Richard Ulrich Mon, 17 May 2004 08:49:17 -0700

[I'm top-posting a couple of comments, and deleting most of my
own post that was cited.]


You seem to make one of my points -- that the popular ICCs
will cover up mean differences, which might or might not be
interesting.  They may also cover up a single poorly correlated
rater among multiple raters.  That can be good for planning
for new raters, but it is not so good for training raters or for
reporting results in full.


On 13 May 2004 07:18:47 -0700, [EMAIL PROTECTED] (Paul R Swank)
wrote:

> First, let's consider the 2 observation case. I have 2 assessments of a
> behavior rating taken 20 minutes apart; I wish to know how reliable the
> assessments are. There are two potential sources of error, the relative
> error over time, in which the order of scores for subject a and subject b on
> the two assessments may be the same or different, and the absolute error in
> which all subjects may be lower on the second assessment. If I do a Pearson
> correlation between the two, I find a correlation of .78097 (n=313, p <
> .0001). I do an analysis of variance with repeated measures on time (the
> equivalent of the paired t-test) and find a significant difference between
> the means (time 1, mean = 3.377, sd=1.10; time 2, mean = 3.291, sd=1.16; F(1, 312)
> = 4.16; p = .0422). Now, I do a generalizability analysis. I find the
> following variance components:
> 
> Subjects                              .99269
> Time                                  .00300
> Subjects by Time                      .27842
> 
> The generalizability coefficient (or ICC) considering only the relative
> error (interaction) is
> 
> .99269 / (.99269 + .003) = .99269/1.27111 = .78096 which is the Pearson
   - oops! that first denominator should read (.99269 + .27842) -
> Correlation within rounding. I then figure the coefficient taking into
> account the mean difference as well.
> 
> .99269 / (.99269 + .003 + .27842) = .99269 / 1.27411 = .779.
> 
> I have had a minimal effect on the reliability as should be obvious by the
> variance component for time, which is very small relative to the other
> variance components. 
> 
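The arithmetic in the quoted message can be checked directly. The lines below are only an illustrative sketch: the variable names are made up here, the three values are the variance components as reported above, and the relative coefficient uses the denominator that actually yields 1.27111 (subjects plus subjects-by-time).

```python
# Variance components as reported in the quoted message.
var_subjects = 0.99269
var_time = 0.00300
var_subjects_by_time = 0.27842  # interaction, confounded with residual error

# Relative coefficient: only the subjects-by-time interaction counts as error.
g_relative = var_subjects / (var_subjects + var_subjects_by_time)

# Absolute coefficient: the time main effect (the mean shift) counts as error too.
g_absolute = var_subjects / (var_subjects + var_time + var_subjects_by_time)

print(f"relative: {g_relative:.5f}")  # relative: 0.78096
print(f"absolute: {g_absolute:.5f}")  # absolute: 0.77912
```

The two values differ only in the third decimal place, which is the "minimal effect" described in the quoted text.
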
> Thus, even though the difference between time 1 and 2 is significant (due in
> part to the large sample and the strong correlation between two observations
> taken 20 minutes apart), the effect on the reliability is small. Of course,
> I could observe that in the means as well, since they're very close, but of
> course, when you see two means, many people want to know if they are
> statistically different. 
> 
> Add to this result, the fact that, because in reality I have 5 assessments
> of the observed variable over an hour's time, the generalizability result is
> much easier to deal with than is 10 unique Pearson correlations and an ANOVA
> (hopefully not 10 paired t-tests), and it becomes clear that the
> generalizability analysis is cleaner than breaking the analysis into two
> parts.
> 
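With several assessments per subject, both coefficients can come out of a single two-way (subjects by time) analysis. What follows is a minimal sketch, assuming a complete table with one score per subject per occasion and the usual expected mean squares for a two-way random-effects layout without replication. The function names `variance_components` and `g_coefficients` and the small data set are hypothetical; they are not the data or software discussed in this thread.

```python
def variance_components(scores):
    """Estimate variance components from a complete subjects x occasions table.

    `scores` is a list of rows: one row per subject, one column per occasion.
    """
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of occasions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for a two-way layout with one observation per cell.
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_time = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_residual = ss_total - ss_subjects - ss_time

    ms_subjects = ss_subjects / (n - 1)
    ms_time = ss_time / (k - 1)
    ms_residual = ss_residual / ((n - 1) * (k - 1))

    return {
        # Negative estimates are set to zero, a common convention.
        "subjects": max(0.0, (ms_subjects - ms_residual) / k),
        "time": max(0.0, (ms_time - ms_residual) / n),
        "residual": ms_residual,          # subjects-by-time plus error
        "f_time": ms_time / ms_residual,  # F test for the time effect
    }


def g_coefficients(comp):
    """Return (relative, absolute) coefficients for a single occasion."""
    relative = comp["subjects"] / (comp["subjects"] + comp["residual"])
    absolute = comp["subjects"] / (
        comp["subjects"] + comp["time"] + comp["residual"]
    )
    return relative, absolute


# Hypothetical ratings: 5 subjects, 2 occasions.
toy = [[3, 4], [5, 5], [2, 3], [4, 4], [1, 3]]
comp = variance_components(toy)
print(comp)
print(g_coefficients(comp))
```

With only two occasions, `f_time` is the square of the paired t statistic, so the mean-difference test and both coefficients come from the same decomposition. The same function accepts five columns without change.
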
[snip sig.]
> 
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
> Behalf Of Richard Ulrich
> Sent: Wednesday, May 12, 2004 2:52 PM

[snip, his and mine]
RU >
> Yes, it is the overall impact, and that can be useful for the
> *final* statement, especially when a very precise statement of 
> overall impact is warranted -- because, for instance, power analyses are
> being based on the exact value of the exact form of ICC that is needed: Same
> versus different raters; single versus 
> multiple scorers.  
> 
> And I think it is an over-generalization to prefer an ICC when the issue is
> the cruder one of apparent adequacy.  The ICC is less informative (about
> means) and less transparent (multiple versions available to select, all of
> them burying the means).
> 
> [snip, rest]

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html