Once again, Mr. Ulrich produces a long string of
ad hominems and completely unmotivated statistical
procedures.
We're still waiting calmly for that "rationale" for
doing the t-test...on non-randomly sampled populations.
To be perfectly clear, the Steiger-Hausman report did
not base its judgment on "means." It presented all the
raw data, not just for citations but also for publications
and a host of impact indices. [Mr. Ulrich had not read the
report during the first weeks of the discussion.]
The inference that there are huge performance differences
between the senior men and women at MIT is supported by
publication rates, grant awards, citation counts, and
impact ratios, not to mention other factors we did not
investigate directly.
Mr. Ulrich, taking a cue from Dr. Gallagher, mischaracterizes
our report to make it sound as though we were drawing
inferences directly from means. As I've explained to Jim
Clark, the use of means is neither necessary nor crucial to
our argument.
Mr. Ulrich is unable to justify not only the t-test itself,
but also the outlier adjustment procedures he applied.
I know this question is going to draw a stony silence, but
I'll ask it anyway.
On what theoretical basis did you "subtract 5000 off the 3 largest
values"? Why would you want to do that, and examine the impact of
doing it on a t-test, when you have (apparently) no idea why you
want to do a t-test?
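For readers who want to see the mechanics, here is a minimal sketch
of what that adjust-then-retest procedure amounts to. The counts
below are invented for illustration (they are not the MIT data), and
scipy's ttest_ind merely stands in for whatever software Mr. Ulrich
actually used:

    # Minimal sketch only: hypothetical citation counts, not the MIT data.
    import numpy as np
    from scipy import stats

    men = np.array([18000., 12000., 10500., 1200., 450., 40.])
    women = np.array([2600., 1900., 1200., 300., 250., 150.])

    def both_p_values(a, b):
        # Student's (pooled-variance) t-test and the Satterthwaite/Welch variant
        p_student = stats.ttest_ind(a, b, equal_var=True).pvalue
        p_welch = stats.ttest_ind(a, b, equal_var=False).pvalue
        return p_student, p_welch

    print("raw means:", men.mean(), women.mean(), both_p_values(men, women))

    # The adjustment in question: subtract 5000 from the 3 largest counts,
    # then re-run the same two tests on the altered data.
    adjusted = men.copy()
    adjusted[np.argsort(adjusted)[-3:]] -= 5000
    print("adjusted means:", adjusted.mean(), women.mean(),
          both_p_values(adjusted, women))

The sketch shows only the mechanics; nothing in it supplies the
missing rationale for either the test or the adjustment.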
Best regards,
Jim Steiger
--------------
James H. Steiger, Professor
Dept. of Psychology
University of British Columbia
Vancouver, B.C., V6T 1Z4
-------------
Note: I urge all members of this list to read
the following and inform themselves carefully
of the truth about the MIT Report on the Status
of Women Faculty.
Patricia Hausman and James Steiger's article,
"Confession Without Guilt?":
http://www.iwf.org/news/mitfinal.pdf
Judith Kleinfeld's article critiquing the MIT Report:
http://www.uaf.edu/northern/mitstudy/#note9back
Original MIT Report on the Status of Women Faculty:
http://mindit.netmind.com/proxy/http://web.mit.edu/fnl/
On Thu, 15 Mar 2001 11:28:38 -0500, Rich Ulrich <[EMAIL PROTECTED]>
wrote:
>Neal,
>I did intend to respond to this post -- you seem serious about this,
>more so than "Irving."
>
>On 13 Mar 2001 22:36:03 GMT, [EMAIL PROTECTED] (Radford Neal)
>wrote:
>
>[ snip, .... previous posts on what might be tested ]
>>
>> None of you said it explicitly, because none of you made any coherent
>> exposition of what should be done. I had to infer a procedure which
>> would make sense of the argument that a significance test should have
>> been done.
>>
>> NOW, however, you proceed to explicitly say exactly what you claim not
>> to be saying:
>>
>RU> >
>> >I know that I was explicit in saying otherwise. I said something
>> >like, If your data aren't good enough so you can quantify this mean
>> >difference with a t-test, you probably should not be offering means as
>> >evidence.
>
> - This is a point that you are still missing.
>I am considering the data... then rejecting the *data* as lousy.
>
>I'm *not* drawing substantial conclusions (about the original
>hypotheses) from the computed t, or going ahead with further tests.
>
>RN>
>> In other words, if you can't reject the null hypothesis that the
>> performance of male and female faculty does not differ in some
>> population from which the actual faculty were supposedly drawn, then
>> you should ignore the difference in performance seen with the actual
>> faculty, even though this difference would - by standard statistical
>> methodology explained in any elementary statistics book - result in a
>> higher standard error for the estimate of the gender effect, possibly
>> undermining the claim of discrimination.
>
>- Hey, I'm willing to use the honest standard error. When I have
>decent numbers to compare. But when the numbers are not *worthy*
>of computing a mean, then I resist comparing means.
>
>RU> >
>> > And many of us statisticians find tests to be useful,
>> >even when they are not wholly valid.
>>
>RN>
>> It is NOT standard statistical methodology to test the significance of
>> correlations between predictors in a regression setting, and to then
>> pretend that these correlations are zero if you can't reject the null.
>
> - again, I don't know where you get this.
>
>Besides, on these data, "We reject the null..." once JS finally did
>a t-test. But it was barely 5%.
>
>And now I complain that there is a huge gap. It is hard to pretend
>that these numbers were generated as small, independent effects that
>are added up to give a distribution that is approximately normal.
>
>[ snip, some ]
>RN>
>> So the bigger the performance differences, the less attention should
>> be paid to them? Strange...
>>
>Yep, strange but true.
>
>They would be more convincing if the gap were not there.
>
>The t-tests (Student's / Satterthwaite) give p-values of .044 and .048
>for the comparison of raw average values, 7032 versus 1529.
>If we subtract off 5000 from each of the 3 large counts (over 10,000),
>the t-tests have p-values of .037 and .036, comparing 4532 versus
>1529.
>
>Subtract 7000 for the three, p-values are hardly different, at .043,
>.040; comparing counts of 3532 versus 1529.
>
>In my opinion, this final difference rates (perhaps) higher on the
>scale of "huge differences" than the first one: the t-tests are
>about equal, but the actual numbers (in the second set) don't confirm
>any suspicions about a bad distribution. The first set is bad enough
>that "averages" are not very meaningful.
>
>
>
>http://www.pitt.edu/~wpilib/index.html
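A brief arithmetic note on the figures quoted above: subtracting a
constant c from each of the 3 largest values lowers that group's mean
by 3c/n, while the other group's mean is untouched (hence 1529
throughout). The quoted means are consistent with a group size of
n = 6; that n is inferred from the numbers, not stated in the post:

    # Sanity check on the quoted means; n = 6 is inferred, not stated.
    n = 6
    print(7032 - 3 * 5000 / n)   # 4532.0 -- matches the "-5000" comparison
    print(7032 - 3 * 7000 / n)   # 3532.0 -- matches the "-7000" comparison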