Ben Kenward wrote:
> My girlfriend is researching teaching methods using a questionnaire, and she
> has answers for questions in the form of numbers from 1 to 5 where 5 is
> strongly agree with a statement and 1 is strongly disagree. She is proposing
> to do a t-test to compare, for example, male and female responses to a
> particular question.
>
> I was surprised by this because I always thought that you needed at least
> interval data in order for a t-test to be valid. Her textbook actually says
> it is OK to do this though. I don't have any of my old (life-sciences) stats
> books with me, so I can't check what I used to do.
>
> So are the social scientists playing fast and loose with test validity, or
> is my memory playing up?
This is a classic issue that comes up often, and the answer needs some
careful distinctions.
Yes, interval data is needed to do a t test. Is the data from a Likert
scale (what your friend has) interval data? Depends on how you see it.
When a respondent puts a mark halfway between two check boxes (i.e., 3.5
on the numerical scale), they are trying to tell you that _they_ see it
as interval, as continuous in fact.
What is the '3' position? Is it really between 2 and 4, or is it a 'none
of the above' type of thing? If the latter, it's no dice - not interval.
For a t test, you really want intervals that are equally spaced. Is
this so? Is this reasonably close to so? Lots more debate on that. By
marking the levels as points on a continuum from the 1 to the 5
positions, you are implying that they are equally spaced. Does the
respondent see them that way? Could be. Maybe we should just try it,
to see what comes out.
For a t test, you prefer a scale which is, in principle, unbounded.
When I do this sort of thing, I sometimes get responses of 0 and 6, for
conditions I didn't anticipate. Otherwise, the scale is restricted at
the bottom and top. How to correct for this?
One way is to do a logit transform (if I get the term right).
Convert the 1 - 5 scale into a 0 to 1 scale by:
    y' = (y-1)/4
then apply a logit transform (omega transform via Taguchi):
    y'' = ln(y'/(1-y'))
The y'' distribution will come closer to the unbounded range the t test
prefers, and a prediction mapped back to the y scale will never be more
than 5 or less than 1.
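A minimal Python sketch of the two steps above, in case it helps to see them written out. The function names are made up for illustration. One thing the formulas do not cover is that y = 1 and y = 5 give y' = 0 and y' = 1, where the logit is infinite, so the clamp (`eps`) below is an assumption of this sketch and not part of the transform as described. The inverse function is just the algebra run backwards.

```python
import math

def rescale(y, lo=1, hi=5):
    """Map a response on the lo..hi scale onto 0..1: y' = (y - lo) / (hi - lo)."""
    return (y - lo) / (hi - lo)

def logit(p, eps=0.02):
    """y'' = ln(y' / (1 - y')).

    The clamp to [eps, 1 - eps] is an assumption of this sketch: without it,
    y' = 0 and y' = 1 (responses of 1 and 5) would map to -inf and +inf.
    """
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))

def inverse_logit(z, lo=1, hi=5):
    """Map a value on the logit scale back onto the original lo..hi scale."""
    # Branch on the sign of z so math.exp never overflows.
    if z >= 0:
        p = 1 / (1 + math.exp(-z))
    else:
        ez = math.exp(z)
        p = ez / (1 + ez)
    return lo + (hi - lo) * p
```

Whatever value comes out on the y'' scale, `inverse_logit` returns something between 1 and 5, which is the "never more than 5 or less than 1" property mentioned above.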
BUT... this assumes that the earlier assumptions about scale and
interval size are very tight. They probably aren't. Why waste your
time doing very precise analyses on weak data?
Suggestion:
(a) run the t test on the raw responses, y's. See if anything pops up.
(b) go back and check that the assumption requirements are met or at
least arguable. Check some respondents to see that they saw the scale
as you did, and adjust your thinking to theirs.
(c) IF you have time and the data is reasonably tight, AND if you
want to impress someone with your transformational skills, then go do
that transform and re-analyze. In most cases, the conclusions will not
be greatly different, in my experience. The only place things get dicey
is when a mean response is near the ends (1 or 5). Detecting
differences there can be harder, and a small change there matters more
than a small change in the middle.
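Steps (a) and (c) can be sketched in plain Python as below. This is only an illustration under stated assumptions: the responses are made-up numbers, not real data; the statistic is Welch's unequal-variance t, which is my choice here since the post does not say which flavor of t test is meant; and the clamp `eps` in the transform is an assumption added to keep responses of 1 and 5 finite. Getting a p-value from t and df would need a t-distribution table or a stats package.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def to_logit(y, lo=1, hi=5, eps=0.02):
    """Rescale y to 0..1, clamp away from the ends (an assumption), take the logit."""
    p = (y - lo) / (hi - lo)
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))

# Made-up responses on the 1-5 scale, for illustration only.
group_a = [4, 5, 3, 4, 5, 4, 2, 5]
group_b = [3, 2, 4, 3, 1, 3, 2, 4]

# (a) t statistic on the raw responses
t_raw, df_raw = welch_t(group_a, group_b)

# (c) the same comparison after the transform
t_logit, df_logit = welch_t([to_logit(y) for y in group_a],
                            [to_logit(y) for y in group_b])

print(f"raw:   t = {t_raw:.2f}, df = {df_raw:.1f}")
print(f"logit: t = {t_logit:.2f}, df = {df_logit:.1f}")
```

If the two t values point the same way and are of similar size, the transform has not changed the conclusion, which is the usual outcome described in (c).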
References? Sorry. I've only done it a couple of times, and I know it
works - it gets me predictions that pan out in confirmation. Treating
the data as nominal, instead of interval, may throw away information.
That's expensive.
Good luck,
Jay
--
Jay Warner
Principal Scientist
Warner Consulting, Inc.
4444 North Green Bay Road
Racine, WI 53404-1216
USA
Ph: (262) 634-9100
FAX: (262) 681-1133
email: [EMAIL PROTECTED]
web: http://www.a2q.com
The A2Q Method (tm) -- What do you want to improve today?