Is the sum still a better measure of the "trait" than any individual item?
It is not clear what inferences can be drawn from the sum of the four items.
Reliability is the prerequisite of validity.
Whose judgment of face validity shall be relied on, given that face
validity is in the eye of the beholder?
Statistical conclusions rely on the quality of the psychometric
properties: reliability and validity. However, the quality of those
properties cannot be established by statistically significant results in
this case.
____________________________________________
Peter Chen
Industrial/Organizational Psychologist and Researcher
Liberty Mutual Research Center
71 Frankland Road, Hopkinton, MA 01748 USA
Tel. (508) 435-9061 x301 Fax. (508) 435-8136
E-mail: [EMAIL PROTECTED]
Web: http://www.libertymutual.com/research
____________________________________________
-----Original Message-----
From: [EMAIL PROTECTED] [SMTP:[EMAIL PROTECTED]]
Sent: Friday, December 10, 1999 12:22 PM
To: [EMAIL PROTECTED]
Subject: Re: Scale Reliability
You might like my reply.
Even with a low Alpha, the sum of the items is still a better measure
of the trait (locus of control) than any individual item. This assumes
that the items have 'face validity' as measures of locus of control.
It also assumes that whatever they do have in common is mainly the
locus-of-control trait (rather than, say, some other thing like social
desirability).
If so, then the only real 'problem' is that your summed score is a
relatively weak measure of locus of control. (That is, it has limited
validity--i.e., the correlation of the score with the true
trait--because reliability constrains the level of validity.) But that
means any statistical analysis you perform is *conservative*. That is,
by using a weak measure of locus of control, you are 'stacking the
deck' against finding a significant relationship between locus of
control and the other variables in your study. Now, if you have
obtained statistically significant results with a weak measure of
locus of control, then your results are still significant! In fact, one
could argue that they are even stronger since you have obtained
significant results with the deck stacked against finding them.
This principle is frequently overlooked. Ultimately, a scale only has
to be as reliable as you need to find statistically significant results
when comparing the scale with another construct.
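The attenuation argument above can be sketched in a few lines of Python.
This is only an illustration under classical test theory assumptions
(uncorrelated measurement errors), and it treats Alpha as the reliability
of the summed score, which is itself only an estimate. The true
correlation of 0.40 and the perfectly reliable second variable are
made-up values, not figures from the study.

```python
import math


def validity_ceiling(reliability):
    """Upper bound on the correlation of an observed score with anything:
    the square root of its reliability (the 'reliability index')."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return math.sqrt(reliability)


def attenuated_correlation(true_r, rel_x, rel_y=1.0):
    """Spearman's attenuation formula: the observed correlation expected
    when the true-score correlation is true_r and the two measures have
    reliabilities rel_x and rel_y."""
    return true_r * validity_ceiling(rel_x) * validity_ceiling(rel_y)


alpha = 0.30    # Alpha reported in the original post
true_r = 0.40   # hypothetical true correlation, for illustration only

print(round(validity_ceiling(alpha), 3))                # 0.548
print(round(attenuated_correlation(true_r, alpha), 3))  # 0.219
```

With Alpha = 0.30, a true correlation of 0.40 would be expected to show
up as roughly 0.22, which is the sense in which the analysis is
conservative.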
So, to summarize, if you have obtained significant results with your
summed score, you can go back to your critics with confidence and point
out that you have done so with a conservative analysis, and that had
you used a more 'reliable' scale, your results would only be stronger.
Naturally this assumes you have obtained positive results. If you have
obtained negative results (lack of correlation between the scale and
some other variable(s)), then clearly this logic does not apply.
One other thing to mention: one could set up the problem as a
LISREL-type model, in which the four items are multiple indicators of a
common trait (locus of control). Interestingly, in a multiple-indicator
type model, people rarely bring up the issue of the reliability of the
common trait and how it is influenced by the number of indicators,
although, logically, one would think it would apply more or less in the
same way as adding the items to create an aggregate score. This isn't
to suggest that you do a LISREL analysis--it's merely to point out a
logical inconsistency in how people regard multiple indicators.
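For the summed-score case, the dependence of reliability on the number
of items is usually described with the Spearman-Brown formula. The
sketch below is a rough illustration only: it assumes the items are
parallel (equally good, equally intercorrelated), which real items
rarely are, and it does not model a LISREL-type analysis. The figures 4,
20, and 0.30 come from the original post; everything else is
illustrative.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a scale is lengthened (or shortened)
    by length_factor, assuming parallel items."""
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability
            / (1.0 + (length_factor - 1.0) * reliability))


alpha_4_items = 0.30  # Alpha of the four-item sum in the original post

# Reliability implied for a single item (shortening 4 items to 1).
single_item = spearman_brown(alpha_4_items, 1 / 4)

# Reliability predicted for 20 comparable items (lengthening 4 to 20).
twenty_items = spearman_brown(alpha_4_items, 20 / 4)

print(round(single_item, 3))   # 0.097
print(round(twenty_items, 3))  # 0.682
```

Under these assumptions, four items with Alpha = 0.30 imply a
single-item reliability near 0.10, so the sum is still more reliable
than any one item, and twenty such items would reach roughly 0.68.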
John Uebersax
[EMAIL PROTECTED]
In article <AC09DC4F4DFCD211A83C00805FE6138D3691B9@NHQJPK1EX2>,
[EMAIL PROTECTED] (Magill, Brett) wrote:
> Just wanted people's thoughts on the following:
>
> I am a graduate student in sociology studying individuals' perceptions
> of control (locus of control) using existing data. The data set
> includes four items to measure this construct, which were taken from a
> larger scale of more than twenty, the larger scale reaching an
> acceptable level of reliability (I do not know the exact level, but it
> is a widely researched and used instrument) in previous research. The
> four items that were included were selected as the best measures of
> the construct based on empirical evidence (item-total correlations,
> factor analysis).
>
> In my own research, I used these items and decided to sum responses
> across these four Likert-type items. However, the Alpha reliability is
> very low, 0.30 (items were reverse scored as necessary and coding was
> double-checked). I defended the decision to sum the items, despite the
> low Alpha, based on the fact that they were selected from a larger set
> of items which are internally consistent. In presenting my findings, I
> was heavily criticized for this decision.
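For readers who want to check an Alpha value like the one quoted above,
here is a minimal Python sketch of Cronbach's Alpha for a summed scale,
using only the standard library. The function names, the 1-to-5
response range, and the response data are all hypothetical; this is not
the poster's data or code.

```python
from statistics import variance


def reverse_score(value, low=1, high=5):
    """Reverse-score a Likert response on an assumed low..high scale."""
    return low + high - value


def cronbach_alpha(rows):
    """Cronbach's Alpha for a list of respondents, each a list of k item
    scores (already reverse-scored where necessary)."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sum of the sample variances of the individual items.
    item_var_sum = sum(variance(col) for col in zip(*rows))
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1.0 - item_var_sum / total_var)


# Hypothetical raw responses to four items; item 4 is negatively worded.
raw = [
    [4, 5, 3, 2],
    [2, 3, 4, 4],
    [5, 4, 4, 1],
    [3, 2, 2, 3],
    [4, 4, 5, 2],
]
scored = [row[:3] + [reverse_score(row[3])] for row in raw]
print(round(cronbach_alpha(scored), 2))
```

Alpha rises as the items covary more strongly relative to their
individual variances, so a value as low as 0.30 means the four items
share little common variance in that sample.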