On Tue, 27 Feb 2001 07:49:23 GMT, [EMAIL PROTECTED] (Irving
Scheffe) wrote:

My comments are written as responses to the technical points in
Jim Steiger's last post.  This is shorter than his post, since I
omit redundancy and mostly ignore his 'venting.'  I think I offer
a slightly different perspective than in my previous posts.

[ snip, intro. ]

JS>
> You are the one who examined nonrandom data, representing citation
> counts over a 12 year period for senior male and female MIT biologists
> matched for year of Ph.D.  You look at these data, which
> show a HUGE difference in performance between the men and women,
> and declare that a significance test is necessary. But you
> cannot provide any mathematical justification for the test.

> I gave several examples to try to jar you into realizing that
> a statistical test on the data cannot answer the question you
> want answered.

To start with, I never examined any *data*.  I kept away from
the papers because I knew so little about the data and they looked
so messy; I made some comments about how difficult the problem could be.

I tossed in a couple of comments to encourage Gene G., who made
some good sense, as did Dennis.  As I read it, you proceeded to 
browbeat them, while failing to respond to their substance.
I have tried to make sense of that early part of *your* argument,
where you want to leap over their critiques.

You claim a HUGE difference.  You say you assert this out of an
exquisite sensitivity to numbers.  Dennis challenged this on the
basis of "lousy standards" -- in either metric or content --
and Gene challenged it as misleading, because it was "not
(nominally) significant."  They disagreed with you on the inference
you drew from two means.

I agree that a huge difference may be useful.  I agree that t-tests
don't offer any final resolution.  As I posted before, with
nonrandom data we have to argue contingencies, explore options,
and make what inferences we can.  You seem to cut that short --
chop! -- pronouncing your own verdict as final; but I don't see how
hammering your own gavel can convince people who have the
choice of looking elsewhere.

You may think that you are speaking from unimpeachable epiphany;
to the rest of us, it looks like you are jumping to a conclusion.

You offer your *inference* that a huge citation difference explains
the outcome.  Okay, that could be reasonable.  If the effect is
direct but attenuated, the difference in citations should be
larger -- in variance accounted for, by some measure -- than the
difference in outcomes, which, I think, we stipulate has some
size to it.

If those measurements are on a reasonably useful metric, then a
t-test should show it.  It is my own experience, and part of my
own learned, "exquisite" sensitivity to numbers, that
 (1) a mean difference as large as you illustrated should result
in a t-test that is significant, unless there is something screwy
with the numbers;
 (2) and if there is something that screwy with the numbers, then
it is usually misleading and wrong to present the MEANS as if
their contrast were meaningful ("huge").
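
For what it's worth, here is a toy illustration of point (2), in
Python; the numbers are invented for the purpose and are nobody's
data:

    # Toy illustration: on heavily skewed numbers, the means can sit
    # far apart while the medians (or the logs) tell a milder story.
    import math
    import statistics

    a = [90, 95, 110, 120, 9500]   # one wild value in the bunch
    b = [80, 85, 100, 115, 130]

    print(statistics.mean(a), statistics.mean(b))      # 1983 vs 102: "huge"
    print(statistics.median(a), statistics.median(b))  # 110 vs 100: not huge
    print([round(math.log10(x), 2) for x in a])        # logs expose the stray value

A contrast of means that evaporates under the median is telling you
about the distribution, not about the groups.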

Now, a test statistic is not a "mathematical necessity."
It is a request that you respect the conventions of statisticians,
even when we ask for a test on nonrandom data, for what we might
learn from it.  Non-significant tests, which I had thought the
data were producing, would really undermine your adjective "huge".
A "significant" test, which you now report, lends it some credibility.
Gene's permutation test says that those sets are not disjoint,
however, so there is some basis for direct comparison.  The
most extreme permutation would have undercut *one* form of
comparison, and the most obvious part of one argument about
discrimination (though, I expect, not everything).  It *looks*
like you didn't want to consider a comparison because you
figured you could win the argument by repetition and ferocity.
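
For concreteness, here is a minimal sketch of the sort of exact
permutation test Gene described, run on the citation counts you
quote further down; it assumes the sex labels are exchangeable
under the null, which is of course the very thing in dispute:

    from itertools import combinations

    males   = [12830, 11313, 10628, 4396, 2133, 893]
    females = [2719, 1690, 1301, 1051, 935]
    pooled  = males + females

    def mean(xs):
        return sum(xs) / len(xs)

    observed = mean(males) - mean(females)

    # All C(11,6) = 462 ways to relabel six of the eleven counts as
    # "male"; count how often the mean difference is at least as extreme.
    splits = list(combinations(range(len(pooled)), len(males)))
    hits = 0
    for idx in splits:
        m = [pooled[i] for i in idx]
        f = [pooled[i] for i in range(len(pooled)) if i not in idx]
        if mean(m) - mean(f) >= observed:
            hits += 1

    print(f"one-sided permutation p = {hits}/{len(splits)} = {hits/len(splits):.4f}")

Whatever it prints, such a test only says how unusual the split is
under relabeling; it cannot say *why* the counts look the way they do.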

Your first excuse for not computing a t was that this was a
"population," but that was flat wrong.  I asked for your textbook
references, and finally offered my own, in order to figure out
your context.  I offered "nonrandom," which you used, above.
However, the old arguments against computing "test statistics"
on nonrandom samples have hardly any force these days -- I
offer epidemiology as the pervasive (and persuasive) influence.

Epidemiologists need to be reminded about the limits to their
inference -- they tend to forget them entirely -- but I think you
are standing alone if you refuse to compute, claiming that old
principle.  I don't know if your role is such that someone will
*have* to answer you, or if you are fated to wind up
ignored (as non-responsive and therefore irrelevant), and angry.

[snip, my comment to which JS wrote:]
> Not so. If you were following the logic of the many examples I've
> presented, you could see that you can construct a reductio ad
> absurdem for any of the types of significance tests you are
> proposing. If I believed strictly in hypothesis testing 

Ah, yes, the examples.  I'm sorry, but these still seem beside
*my* points.  They seemed designed to avoid discussion
of the meaning of the numbers.  Whether there is "significance" or
not, the burden shifts to one argument or the other.  The
first example featured 3+3 non-overlapping numbers near 90,
as "productivity" of some sort.  I said those seemed like
screwy numbers, justifying a closer look.
Another had two disjoint sets of numbers, from which one might
conclude, if anything, "These are disjoint sets, whatever
they mean....  It probably justifies a closer look at the
numbers."  You, on the other hand, want to pronounce a
difference "huge" and shut down any discussion, contrary
to what I conclude from the examples.

I have seen the MIT numbers, now, in your post.  The top three
men are an impressive set.  I have also read the pages at the URLs
posted by Dennis, and I have grasped that 10,000 cites are
*conceivably* the result of a single, joint authorship of some
basic methodology that everyone must cite.  Furthermore, I wonder
whether, HERE, that might be the case.  Those top three numbers leave
a big gap to the rest.  Like the gap in the Example, the irregularity
makes me wonder what is going on with the numbers -- curiosity you
seem to discourage.  I'll just add:  If "number of citations" *always*
rules with promotion committees/deans, then those folks would be
incompetent asses.


[ snip, about CI; redundancy, venting ]

JS>
> What "outlier" are you referring to? What statistical rule did you 
> use to determine the "outlier"?
[ snip; MIT paper had data; t-test *is* significant at .05 ]

 - Sorry, my mistake.  As I said several times before, the
"huge" nominal difference in means, combined with what I thought
was a failed test, pointed to a bad distribution.  I expected an
outlier or two.  Now, you give me three -- that's a different
oddity, and it calls for a different sort of explanation.
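
Since you ask what rule: here is one conventional screen, Tukey's
1.5 x IQR fences, offered purely as an illustration -- nobody in
this thread committed to this particular rule:

    # Tukey's fences: flag values beyond 1.5 * IQR outside the quartiles.
    males = [12830, 11313, 10628, 4396, 2133, 893]

    xs = sorted(males)
    n = len(xs)
    q1, q3 = xs[n // 4], xs[(3 * n) // 4]   # crude quartiles; packages interpolate
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    print([x for x in xs if x < lo or x > hi])

On these six counts the fence flags nothing: the top three values
sit together and widen the IQR instead of falling outside it.  Which
is the point -- three clustered high values are a different oddity
from a single stray outlier.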

You go on to say that you did not state you were doing inference
on means.  I say, no, you just went ahead and did it while claiming
NOT to.  And then you shouted down the demurrals.  If you could point
to a rule that tied "conditions" to "citations" as measured, then
you would merely have an administrative problem.  But it would be
converted to "inference" as soon as someone demanded justification.


I include the data, just so they are reproduced here.

> Here are the raw data for the citation counts for the 5 senior
> MIT female biologists and 6 males who graduated from 1970-76.
>
>    Males    Females
> ----------------------
>    12830    2719
>    11313    1690
>    10628    1301
>     4396    1051
>     2133     935
>      893
> -----------------------
>
> These data are based on 12 years worth of records, from 1989-2000.
> The above could be broken down in numerous other ways. For example, we
> could produce citation counts per year, try to perform some kind of
> correction for the highly specific areas the individuals publish in,
> etc. Time series could be examined. 

Ahem.  These alternatives seem stunningly dull, compared to
what you should have heard about if you were paying attention.
First, split "first authorship" vs. "other."  Factor out the
contribution of any single paper.  A time series may be useful,
if you mean data for each published piece.
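
And since the t-test has finally entered the record, here is a
minimal way to reproduce the check, assuming scipy is at hand;
whether the original computation pooled the variances, I can't say:

    from scipy import stats

    males   = [12830, 11313, 10628, 4396, 2133, 893]
    females = [2719, 1690, 1301, 1051, 935]

    # Welch's version, safer given the very unequal spreads,
    t_w, p_w = stats.ttest_ind(males, females, equal_var=False)
    # and the classical pooled-variance version.
    t_p, p_p = stats.ttest_ind(males, females, equal_var=True)

    print(f"Welch  t = {t_w:.2f}, two-sided p = {p_w:.3f}")
    print(f"pooled t = {t_p:.2f}, two-sided p = {p_p:.3f}")

Skew this heavy is exactly where the two versions can part company --
and where neither answers the question of what the counts mean.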

> However, these data are anything but a random sample. MIT is one of
> the most selective universities in the world in terms of whom it
> hires. 

 - another possible bias: What was written before hire by MIT?
 - another consideration: These all look pretty good, so what is
"sufficient"?  What level of treatment is *owed*, at a minimum,
to faculty who are the most select in the world?
How much does it change if they are wooed by other universities,
in the way that Nobel winners are, or that women scientists
have been lately?

Can a school like MIT keep some of its faculty on board at
low wages, etc., because of prestige?  (Probably.)
Do they?  (I have no idea.)  Should they?  (Not often.)


[ snip, details about superior records. ]
> Remembering that these data were gathered over a 12 year span,
> and that they are designed ONLY to answer the question, "Is
> it possible that the higher departmental perks and status
> of these males is due to performance," I think your commentary
> was ludicrous, Rich.

Remembering that I never saw these data before, and that I was
pointedly picking at the frayed holes of your *argument*, that
judgment is impertinent.  If these data were carefully designed,
I should expect more qualification and justification for them;
aren't they a crude count?  Perhaps I miss something by not
reading the papers; but, if so, you should have pointed Gene and
Dennis politely to the details, instead of blundering around and
making it appear that "this one is huge" is your whole basis.
My commentary is devoted to your presentation, here.

[ snip, "importance of issue" and more redundancy.]

Hope that helps.
-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


