I have a scenario where three people were asked to score 100+ cases, giving each a score of 1, 2, 3, or 4 (categories of disease). These folks were subsequently educated and then given 100+ different cases, which they scored in the same way. The theory is that the post-education set of observations will be more consistent across observers.
I took the two datasets and ran them against SAS's %Magree macro and received the results listed below. I don't know statistics well enough to judge whether the two sets -- pre- and post-education -- are statistically different in terms of the interobserver consistency. Can someone take a look and educate me? I'd really appreciate it...

Gigi

KAPPA SCORES BEFORE EDUCATION
MAGREE macro run for before_edu
Kappa statistics for nominal response classification

              Kappa      Error         z    Prob>Z
    1       0.82270   0.057735   14.2495    <.0001
    2       0.12203   0.057735    2.1135    0.0173
    3       0.50559   0.057735    8.7571    <.0001
    4       0.75598   0.057735   13.0940    <.0001
    Overall 0.57361   0.041025   13.9818    <.0001

KAPPA SCORES AFTER EDUCATION
MAGREE macro run for after_edu
Kappa statistics for nominal response classification

              Kappa      Error         z    Prob>Z
    1       0.85555   0.047619   17.9666    <.0001
    2       0.40527   0.047619    8.5106    <.0001
    3       0.33144   0.047619    6.9602    <.0001
    4       0.76610   0.047619   16.0882    <.0001
    Overall 0.58435   0.027792   21.0259    <.0001
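
One rough way to approach the pre/post question is to treat the two kappas as independent estimates and look at their difference divided by the combined standard error. The Python sketch below does that with the numbers from the macro output. It rests on assumptions that are not in the original output: the function name `compare_kappas` is made up for illustration; the two case sets are treated as independent samples, even though the same three raters scored both; and the macro's "Error" column is used as-is, although it is identical across categories and so looks like a standard error under the null hypothesis of no agreement rather than a standard error of each estimate. Treat the result as a first approximation, not a definitive test.

```python
import math

def compare_kappas(k1, se1, k2, se2):
    """Approximate z-test for the difference between two kappas.

    Assumes the two estimates are independent and roughly normal,
    and that se1/se2 are usable standard errors (an assumption here).
    Returns (z, two-sided p-value).
    """
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    z = (k2 - k1) / se_diff
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# (kappa, error) pairs copied from the MAGREE output.
before = {
    "1": (0.82270, 0.057735),
    "2": (0.12203, 0.057735),
    "3": (0.50559, 0.057735),
    "4": (0.75598, 0.057735),
    "Overall": (0.57361, 0.041025),
}
after = {
    "1": (0.85555, 0.047619),
    "2": (0.40527, 0.047619),
    "3": (0.33144, 0.047619),
    "4": (0.76610, 0.047619),
    "Overall": (0.58435, 0.027792),
}

for label in before:
    k1, se1 = before[label]
    k2, se2 = after[label]
    z, p = compare_kappas(k1, se1, k2, se2)
    print(f"{label:8s} diff={k2 - k1:+.5f}  z={z:+.3f}  p={p:.4f}")
```

Each printed row gives the change in kappa (after minus before), the approximate z statistic, and a two-sided p-value. Because five comparisons are made, some adjustment for multiple testing may be worth considering before reading much into any single row.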
