From your post, it isn't clear that kappa is the right statistic.
 
Usually one uses kappa when each rater/clinician rates a sample of
patients or cases.  But you describe only a questionnaire that each
clinician completes.  If each clinician completes the questionnaire
just once (as opposed to, say, once for each of a sample of patients),
then I don't see that kappa is appropriate.  Instead, one would use
simpler statistics--such as the standard deviation, across clinicians,
for each item.
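
In case it helps, here is a rough sketch (mine, not from your post) of
that per-item calculation in Python; the array name, sizes, and the
made-up data are only placeholders:

    # Hypothetical data: 16 clinicians (rows) x 10 items (columns),
    # each item scored on a 1-5 scale.
    import numpy as np

    rng = np.random.default_rng(0)
    ratings = rng.integers(1, 6, size=(16, 10))

    # Standard deviation across clinicians, computed item by item.
    item_sd = ratings.std(axis=0, ddof=1)
    for i, sd in enumerate(item_sd, start=1):
        print(f"Item {i:2d}: SD = {sd:.2f}")

A small SD on an item means the clinicians largely agree on that item;
a large SD flags items where they diverge.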
 
You also raise the issue of many possible rater pairs--note, though,
that there are only (16 * 15) / 2 = 120 unique pairs of different
raters.  Rather than calculate 120 separate kappa coefficients, a
simpler alternative might be to calculate a single overall kappa that
measures agreement between any two raters, considering all raters
simultaneously.  That is done with Fleiss' kappa (as opposed to
Cohen's kappa, which applies only to pairwise comparisons).  For a
discussion of the difference between these two types of kappa, see
Joseph Fleiss, Statistical Methods for Rates and Proportions (Wiley,
1981).
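
Purely as a sketch (again my own, and assuming the Python statsmodels
package is available), this is roughly what the single overall
coefficient looks like in practice; the data below are invented:

    import numpy as np
    from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

    n_raters = 16
    print("unique rater pairs:", n_raters * (n_raters - 1) // 2)  # 120

    # Hypothetical ratings: 30 cases (rows) x 16 raters (columns),
    # each rating one of 3 categories.
    rng = np.random.default_rng(1)
    ratings = rng.integers(0, 3, size=(30, n_raters))

    # Convert to a cases-by-categories count table, then compute
    # Fleiss' kappa across all raters at once.
    counts, _ = aggregate_raters(ratings)
    print("Fleiss' kappa:", fleiss_kappa(counts))

One number summarizes agreement among all 16 raters, instead of 120
pairwise Cohen's kappas.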
 
--
John Uebersax
[EMAIL PROTECTED]

