Re: 2x2 tables in epi. Why Fisher test?

2001-05-10 Thread Herman Rubin

In article 9deiug$l0h$[EMAIL PROTECTED],
Ronald Bloom  [EMAIL PROTECTED] wrote:

Significance tests for 2x2 tables require that the single observed
table be regarded (under the null hypothesis of
uniformity or independence) as but a single instance drawn at
random from a universe of replicates.  Insofar as there are at
least three well-known, distinct sample spaces that one
might arguably propose as reasonable models of the universe
of replicates, different probability models arise by which
to judge the extremity of the observed table under the null
hypothesis.

I can even provide more.  But from the standpoint of
classical statistics, it makes little difference.  From
the standpoint of decision theory it does, but then one
would not be doing anything like fixing a significance
level in the first place.

 This has given rise over the years to misunderstandings
between proponents of different small-sample inferential tests
of significance for 2x2 tables.  But the disputes seem largely to
be due to the failure of the disputants to identify precisely
that particular probability setup which is correct for the
particular problem at hand.

At least three distinct such ways of regarding a given 2x2 table can 
be distinguished:

1.) Both row and column marginals are regarded as fixed, and under
the null hypothesis of uniformity, the observed table is treated
as a random draw from the finite set of all 2x2 tables
satisfying that marginal constraint.  This sample-space model
gives rise to the hypergeometric distribution for the
probability of the observed table; thus the Fisher Exact test.

The advantage of this one is that an exact test of the
prescribed level can be produced.
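Under the hypergeometric model of (1), both the table probability and the
one-sided Fisher exact p-value can be computed directly.  A minimal sketch
in Python (the cell layout [[a, b], [c, d]] and the one-sided alternative
are my assumptions, not from the post):

```python
from math import comb

def hypergeom_prob(a, r1, r2, c1):
    """P(top-left cell = a) when the row totals r1, r2 and the
    first column total c1 are all held fixed."""
    return comb(r1, a) * comb(r2, c1 - a) / comb(r1 + r2, c1)

def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value: P(top-left cell >= a),
    conditioning on the margins of the observed table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c
    hi = min(r1, c1)  # largest value the top-left cell can take
    return sum(hypergeom_prob(k, r1, r2, c1) for k in range(a, hi + 1))

# e.g. the table [[3, 1], [1, 3]] gives p = (16 + 1)/70, about 0.243
```

Because the reference set is finite and fully known once the margins are
fixed, the p-value is exact, which is the advantage claimed above.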

2.) The two row (or column) marginals are treated as independent, and the
observed table under the null hypothesis is regarded as
the result of two independent random samples from
binomial distributions with a common success probability.  The
significance test used in this case is identical to the elementary
test for the difference between two sample proportions.

This is a much more complicated testing situation than you
seem to think.  Because of the nuisance parameters, it is
essentially impossible to come up with a natural test
at the precise level, especially for small samples.
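The elementary two-proportion test mentioned here can be sketched as follows
(pooled standard error and normal approximation; the common success
probability is exactly the nuisance parameter at issue, and pooling only
estimates it):

```python
from math import sqrt, erf

def two_prop_z(x1, n1, x2, n2):
    """Z test for the difference of two binomial proportions,
    using the pooled estimate of the common probability under H0."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)                   # pooled nuisance estimate
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))  # standard error under H0
    z = (p1 - p2) / se
    pval = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided normal tail
    return z, pval

# e.g. 30/100 vs 20/100: z is about 1.63, p about 0.10
```

The square of this z statistic equals the Pearson chi-squared statistic for
the 2x2 table, which is one reason the large-sample tests agree across setups.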

3.) Only the total cell sum T is regarded as fixed.  The 
observed table, under the null hypothesis, is regarded as a 
random draw of four cell values satisfying the constraint
that their total T is specified.  This leads to a 
multinomial distribution.  
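The link between setup (3) and setup (1) can be made concrete: start from a
multinomial with independence (product) cell probabilities, condition on the
observed margins, and the hypergeometric law of the Fisher test falls out,
with the unknown row and column probabilities cancelling.  A numerical
check (the probabilities 0.4 and 0.3, the total, and the margins are
illustrative values of mine):

```python
from math import comb, factorial

def multinomial_prob(cells, probs):
    """Probability of the four cell counts under a multinomial
    with only the total fixed."""
    coef = factorial(sum(cells))
    for n in cells:
        coef //= factorial(n)
    p = float(coef)
    for n, pr in zip(cells, probs):
        p *= pr ** n
    return p

pr, pc = 0.4, 0.3   # illustrative row/column probabilities under independence
probs = [pr * pc, pr * (1 - pc), (1 - pr) * pc, (1 - pr) * (1 - pc)]

T, r1, c1 = 8, 4, 4  # total and the margins we condition on
tables = []
for a in range(min(r1, c1) + 1):
    b, c, d = r1 - a, c1 - a, T - r1 - c1 + a
    if min(b, c, d) >= 0:
        tables.append((a, b, c, d))

norm = sum(multinomial_prob(t, probs) for t in tables)
conditional = {t[0]: multinomial_prob(t, probs) / norm for t in tables}
hypergeom = {a: comb(r1, a) * comb(T - r1, c1 - a) / comb(T, c1)
             for a, _, _, _ in tables}
# conditional and hypergeom agree cell for cell: the nuisance
# probabilities drop out once the margins are conditioned on
```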

Each one of these probability setups 1-3 gives rise to a somewhat
different small-sample inferential test.  In particular, 
the schemes (1),(2),(3) give rise to distributions conditioned
on 3, 2, and 1 fixed parameters respectively.

But these parameters are unknown.  Testing with nuisance
parameters is very definitely not easy, and exact tests
are hard to come by.  Even in other types of problems,
conditional tests are often used.  In fact, in many
practical problems, the sample size itself need not be
fixed.  It is not uncommon to use the number of
observations as if it were a fixed sample size, and it is
easy to give examples where this can be shown not to do
what is wanted.

 Since, for
large cell values, the large-sample approximations to all
of these distributions (apparently?) converge to the
Chi-Squared distribution, it is only in situations with
small cell sizes that the controversy over choice of
probability model is of practical (?) import.

As long as the conditional probabilities are the same,
and one uses one of the scenarios you mentioned, the
distribution of the Fisher exact test given the marginals
is as stated.  Thus the probability that the test at a
given level rejects is precisely the stated level in all
of these cases, assuming that randomized testing is used.
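That claim can be checked by simulation: generate tables under scheme (2),
two independent binomials with a common success probability, and apply a
one-sided randomized Fisher test at nominal level 0.05.  The rejection rate
comes out at the nominal level even though the margins are random.  A sketch
(the sample sizes, the null probability 0.3, and the seed are arbitrary
choices of mine):

```python
import random
from math import comb

def upper_tail(a, r1, r2, c1):
    """P(top-left cell >= a) under the hypergeometric given the margins."""
    hi = min(r1, c1)
    return sum(comb(r1, k) * comb(r2, c1 - k)
               for k in range(a, hi + 1)) / comb(r1 + r2, c1)

def randomized_fisher_reject(a, b, c, d, alpha, rng):
    """One-sided randomized Fisher test with exact conditional level alpha."""
    r1, r2, c1 = a + b, c + d, a + c
    p_ge = upper_tail(a, r1, r2, c1)                          # P(A >= a)
    p_gt = p_ge - comb(r1, a) * comb(r2, c1 - a) / comb(r1 + r2, c1)
    if p_ge <= alpha:
        return True
    if p_gt < alpha:   # boundary table: reject with just enough probability
        return rng.random() < (alpha - p_gt) / (p_ge - p_gt)
    return False

rng = random.Random(0)
n1 = n2 = 8
p0 = 0.3                          # common null success probability
reps, alpha, rejections = 50_000, 0.05, 0
for _ in range(reps):
    x1 = sum(rng.random() < p0 for _ in range(n1))  # row 1: Binomial(n1, p0)
    x2 = sum(rng.random() < p0 for _ in range(n2))  # row 2: Binomial(n2, p0)
    if randomized_fisher_reject(x1, n1 - x1, x2, n2 - x2, alpha, rng):
        rejections += 1
rate = rejections / reps          # close to 0.05, despite random margins
```

Without the randomization step the test is conservative, which is the usual
complaint about Fisher's test in this setting.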

If one uses a decision approach, none of this is correct,
even if the Fisher model happens to be true.
-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
[EMAIL PROTECTED] Phone: (765)494-6054   FAX: (765)494-0558


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: 2x2 tables in epi. Why Fisher test?

2001-05-10 Thread Ronald Bloom

In sci.stat.edu Herman Rubin [EMAIL PROTECTED] wrote:

Each one of these probability setups 1-3 gives rise to a somewhat
different small-sample inferential test.  In particular, 
the schemes (1),(2),(3) give rise to distributions conditioned
on 3, 2, and 1 fixed parameters respectively.

 But these parameters are unknown.  Testing with nuisance
 parameters is very definitely not easy, and exact tests
 are hard to come by.  Even in other types of problems,

  I was not here referring to the unknown nuisance parameter
(namely the unknown binomial probability).  In schemes
(1), (2), (3) the 3, 2, and 1 fixed conditioning parameters
are, respectively:  (a) two row marginals and one column
marginal  (b) two independent row marginals  (c) the 
total sum of four cells.   In the conditioning arguments
which yield the significance tests I alluded to above, 
those  3, 2, or 1 parameters are *known*.  


 As long as the conditional probabilities are the same,

   which conditional probabilities are you referring to?

 and one uses one of the scenarios you mentioned, the
 distribution of the Fisher exact test given the marginals
 is as stated.  Thus the probability that the test at a
 given level rejects is precisely the stated level in all
 of these cases, assuming that randomized testing is used.


 If one uses a decision approach, none of this is correct,
 even if the Fisher model happens to be true.


  I was only addressing the matter of the logical relationship
between the probability model used in the significance test and
the implied underlying sample space of 2x2 tables from which
the observed table was drawn.  It seems to me that the
choice of experimental design has some bearing on the choice
of such a universe, and I was wondering why the Fisher universe
of permutations with all four marginals fixed is chosen as the
basis for inferential tests for experimental setups in which
quite plainly only *two* marginals can be regarded as fixed
(e.g. case-control studies, and so on).  Is there a simple
answer to this question?  (I guess there really is not...)






Question

2001-05-10 Thread Magill, Brett

A colleague has a data set with a structure like the one below:

ID  X1  X2    Y
1   1   0.70  0.40
2   1   0.80  0.40
3   1   0.65  0.40
4   2   1.20  0.25
5   2   1.10  0.25
6   3   0.90  0.30
7   4   0.50  0.50
8   4   0.60  0.50
9   4   0.70  0.50

Where X1 is the organization.  X2 is the percent of market salary an
employee within the organization is paid--i.e. ID 1 makes 70% of the market
salary for their position and the local economy.  And Y is the annual
overall turnover rate in the organization, so it is constant across
individuals within the organization.  There are different numbers of
employee salaries measured within each organization. The goal is to assess
the relationship between employee salary (as percent of market salary for
their position and location) and overall organizational turnover rates.

How should these data be analyzed?  The difficulty is that the data are
cross level.  Not the traditional multi-level model however.  That there is
no variance across individuals within an organization on the outcome is
problematic.  Of course, so is aggregating the individual results.  How can
this be modeled while preserving the fact that there is variance both within
organizations and between organizations?  I suggested that this was a
repeated measures problem, with repeated measurements within the
organization, my colleague argued it was not. Can this be modeled
appropriately with traditional regression models at the individual level?
That is, ignoring X1 and regressing Y ~ X2.  It seems to me that this
violates the assumption of independence.  Certainly, the percent of market
salary that an employee is paid is correlated between employees within an
organization (taking into account things like tenure, previous experience,
etc.).

Thanks





Re: (none)

2001-05-10 Thread Rich Ulrich


 - selecting from CH's article, and re-formatting.  I don't know if 
I am agreeing, disagreeing, or just rambling on.

On 4 May 2001 10:15:23 -0700, [EMAIL PROTECTED] (Carl Huberty)
wrote:

CH:   Why do articles appear in print when study methods, analyses,
results, and conclusions are somewhat faulty?

 - I suspect it might be a consequence of Sturgeon's Law, 
named after the science fiction author.  Ninety percent of 
everything is crap.  Why do they appear in print when they
are GROSSLY faulty?  Yesterday's NY Times carried a 
report on how the WORST schools have improved 
more than the schools that were only BAD.  That was much-
discussed, if not published.  - One critique was the 
absence of peer review.  There are comments from statisticians
in the NY Times article; they criticize, but (I thought) they 
don't get it  on the simplest point.

The article, while expressing skepticism by numerous 
people, never mentions REGRESSION TOWARD the MEAN
which did seem (to me) to account for every single claim of the
original authors whose writing caused the article.


CH:  []  My first, and perhaps overly critical, response  is that
the editorial practices are faulty[ ... ] I can think of two
reasons: 1) journal editors can not or do not send manuscripts to
reviewers with statistical analysis expertise; and 2) manuscript
originators do not regularly seek methodologists as co-authors.  
Which is more prevalent?

APA Journals have started trying for both, I think.  But I think
that statistics only scratches the surface.  A lot of what arises
are issues of design.  And then there are issues of data analysis.

Becoming a statistician helped me understand those so that I could
articulate them for other people;  but a lot of what I know was never
important in any courses.  I remember taking just one course on
epidemiology, where we students were responsible for reading and
interpreting some published report, for the edification of the whole
class -- I thought I did mine pretty well, but the rest of the class
really did stagger through the exercise.  

Is this critical reading  something that can be learned, and
improved?

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html





Re: 2x2 tables in epi. Why Fisher test?

2001-05-10 Thread Rich Ulrich


 - I offer a suggestion of a reference.

On 10 May 2001 17:25:36 GMT, Ronald Bloom [EMAIL PROTECTED] wrote:

[ snip, much detail ] 
 It has become the custom, in epidemiological reports
 to use always the hypergeometric inference test --
 The Fisher Exact Test -- when treating 2x2 tables 
 arising from all manner of experimental setups -- e.g.
 
 a.) the prospective study
 b.) the cross-sectional study
 c.) the retrospective (or case-control) study
  [ ... ]

I don't know what you are reading, to conclude that this
has become the custom.   Is that a standard for some
journals, now?

I would have thought that the Logistic formulation was
what was winning out, if anything.

My stats-FAQ  has mention of the discussion published in
JRSS (Series B) in the 1980s.  Several statisticians gave 
ambivalent support to Fisher's test.  Yates argued the logic
of the exact test, and he further recommended the  X2 test
computed with his (1935) adjustment factor, as a very accurate 
estimator of Fisher's p-levels.

I suppose that people who hate naked p-levels will have to 
hate Fisher's Exact test, since that is all it gives you.

I like the conventional chi-squared test for the 2x2, computed
without Yates's correction --  for pragmatic reasons.  Pragmatically,
it produces a good imitation of what you describe, a randomization
with a fixed N but not fixed margins.  That is ironic, as Yates
points out (cited above) because the test assumes fixed margins
when you derive it.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html





Re: 2x2 tables in epi. Why Fisher test?

2001-05-10 Thread Elliot Cramer

In sci.stat.consult Ronald Bloom [EMAIL PROTECTED] wrote:
Herman as usual is absolutely correct; the validity of the Fisher test is
analogous to the validity of regression tests which are derived
conditional on x but, since the distribution does not involve x, are valid
unconditionally even if the x's are random.


Incidentally, if one randomizes to get an exact p value, the Fisher test
is uniformly most powerful. Herman can tell us if this is for all three
cases.





Re: Question

2001-05-10 Thread dennis roberts

this is not unlike having scores for students in a class ... one score for 
each student and ... the age of the teacher of THOSE students ... for a 
class ... scores will vary but, age for the teacher remains the same ... 
but the age might be different in ANother class with a different teacher 
... in a sense, the age is like a mean  just like your turnover rate ... 
and you want to know the relationship between student scores and teachers ages

something has to give

i think you have to reduce the data points on X2 ... find the mean within 
organization 1 ... on X2 ... then have .4 next to it ... second data pair 
would be mean on X2 for organization 2 .. with .25 ... etc.

so, in this case ... you have 4 values on X2 and 4 values on Y ... so, what 
is the relationship between those??

look at the following:

  Row     C7     C8
    1   0.72   0.40
    2   1.15   0.25
    3   0.90   0.30
    4   0.60   0.50

MTB > plot c8 c7

[character plot of C8 vs C7 omitted: the four points fall on a clearly
descending line]

Correlations: C7, C8

Pearson correlation of C7 and C8 = -0.957
P-Value = 0.043

there might be a better way to do it but ... looks like a pretty clear case 
of the greater the % of market the organization pays ... the lower its 
turnover rate
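in Python, the same aggregate-then-correlate computation might look like this
(data transcribed from Brett's table; Pearson's r hand-rolled to stay in the
standard library):

```python
from math import sqrt
from statistics import mean

data = [  # (org, pct_of_market_salary, turnover) from the posted table
    (1, 0.70, 0.40), (1, 0.80, 0.40), (1, 0.65, 0.40),
    (2, 1.20, 0.25), (2, 1.10, 0.25),
    (3, 0.90, 0.30),
    (4, 0.50, 0.50), (4, 0.60, 0.50), (4, 0.70, 0.50),
]

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

orgs = sorted({org for org, _, _ in data})
x = [mean([x2 for o, x2, _ in data if o == org]) for org in orgs]  # mean X2 per org
y = [next(t for o, _, t in data if o == org) for org in orgs]      # org turnover
r = pearson(x, y)   # about -0.96
```

the small difference from the -0.957 shown above is consistent with the
organization means having been rounded (e.g. 0.72 for 0.7167) before
correlating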


At 06:05 PM 5/10/01 -0400, Magill, Brett wrote:
A colleague has a data set with a structure like the one below:

ID  X1  X2    Y
1   1   0.70  0.40
2   1   0.80  0.40
3   1   0.65  0.40
4   2   1.20  0.25
5   2   1.10  0.25
6   3   0.90  0.30
7   4   0.50  0.50
8   4   0.60  0.50
9   4   0.70  0.50

Where X1 is the organization.  X2 is the percent of market salary an
employee within the organization is paid--i.e. ID 1 makes 70% of the market
salary for their position and the local economy.  And Y is the annual
overall turnover rate in the organization, so it is constant across
individuals within the organization.  There are different numbers of
employee salaries measured within each organization. The goal is to assess
the relationship between employee salary (as percent of market salary for
their position and location) and overall organizational turnover rates.






Re: 2x2 tables in epi. Why Fisher test?

2001-05-10 Thread David Duffy

In sci.stat.edu Ronald Bloom [EMAIL PROTECTED] wrote:

 It has become the custom, in epidemiological reports
 to use always the hypergeometric inference test --
 The Fisher Exact Test -- when treating 2x2 tables 
 arising from all manner of experimental setups -- e.g.

Only for tables with small cell sizes (and for combination of multiple
such tables), and only because software is freely available.  I would
have thought it is more likely to be seen used for large sparse 2xK tables,
e.g. the HLA literature.  Its shortcomings (conservative under the other setups)
are also well known (I hope!).

David Duffy.





Q: statistical techniques for series of events

2001-05-10 Thread Mark W. Humphries

I have a sample set of series of state-changes/events/behaviors.  From this
sample I'd like to develop a scoring method for the likelihood of a
criterion behavior on other data sets.
Could someone guide me to the appropriate statistical techniques for this
type of problem, and to any useful resources?

Thanks in advance,
 Mark






Re: (none)

2001-05-10 Thread EugeneGall

Subject: Re: (none)
From: Rich Ulrich [EMAIL PROTECTED] 
Date: 5/10/2001 5:15 PM Eastern 
Snip?
CH:   Why do articles appear in print when study methods, analyses,
results, and conclusions are somewhat faulty?

 - I suspect it might be a consequence of Sturgeon's Law, 
named after the science fiction author.  Ninety percent of 
everything is crap.  Why do they appear in print when they
are GROSSLY faulty?  Yesterday's NY Times carried a 
report on how the WORST schools have improved 
more than the schools that were only BAD.  That was much-
discussed, if not published.  - One critique was, the 
absence of peer review.  There are comments from statisticians
in the NY Times article; they criticize, but (I thought) they 
don't get it  on the simplest point.

The article, while expressing skepticism by numerous 
people, never mentions REGRESSION TOWARD the MEAN
which did seem (to me) to account for every single claim of the
original authors whose writing caused the article.
Snip
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


The link to the NY Times story, Rich cites is below.  The design of this study
certainly appears to be a candidate for the regression fallacy.  After vouchers
were introduced in FL, the failing schools improved faster than the almost
failing schools:  "On the eve of Congressional debate over President Bush's
plan to give students at low-performing schools federal money for private
school tuition vouchers, Dr. Greene announced that Mr. Bush's proposal would
work as well. ... 'That's not a theory,' Dr. Greene stated, 'but proven
fact.' ... [Dr. Greene] showed that after failing one time, higher-scoring F
schools posted greater gains than lower-scoring D schools. Because these
schools were otherwise alike, Dr. Greene stated that a threat of vouchers must
have made F schools improve more rapidly."

Regression to the mean can be difficult to control, but in this case there was
an internal control.  In a reanalysis of the Florida school test data, Harris
found that the greater improvement of the worst schools between grading periods
was just as great during the pre-voucher period:  "Dr. Harris found that before
1999, higher-scoring schools in the failing group also gained more than
lower-scoring schools in the next group. The subsequent voucher policy
apparently had no added effect."

http://www.nytimes.com/2001/05/09/national/09LESS.html?searchpv=site01

I wonder if regression to the mean will make it into the Congressional debate
of the education bill in the coming weeks.
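The regression-toward-the-mean mechanism at issue is easy to reproduce: give
every school a stable quality plus independent year-to-year noise, grade on
year 1, and the bottom decile "improves" the most in year 2 with no
intervention at all.  A simulation sketch with made-up parameters:

```python
import random

rng = random.Random(1)

N = 10_000
schools = []
for _ in range(N):
    quality = rng.gauss(50, 10)          # stable "true" school quality
    year1 = quality + rng.gauss(0, 10)   # observed score = quality + luck
    year2 = quality + rng.gauss(0, 10)   # same quality, fresh luck
    schools.append((year1, year2))

schools.sort(key=lambda s: s[0])
worst = schools[:N // 10]           # "F" schools: bottom decile on year 1
next_grp = schools[N // 10:N // 5]  # "D" schools: second-lowest decile

def mean_gain(group):
    return sum(y2 - y1 for y1, y2 in group) / len(group)

# With no voucher threat anywhere in the model, the "F" group still
# gains more than the "D" group: extreme year-1 scores carry the
# most bad luck, and luck does not repeat.
```

This is exactly the pattern attributed to the voucher threat, and it appears
in both the pre- and post-voucher periods, as Harris's reanalysis found.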





Re: Question

2001-05-10 Thread Donald Burrill

On Thu, 10 May 2001, Magill, Brett wrote, inter alia:

 How should these data be analyzed?  The difficulty is that the data 
 are cross level.  Not the traditional multi-level model however.  

Hi, Brett.  I don't understand this statement.  Looks to me like an 
obvious place to apply multilevel (aka hierarchical) modelling.  
(Have you read Harvey Goldstein's text on the method?)  You have persons 
within organizations (just as, in educational applications of ML models, 
one has pupils within schools for a two-level model, and pupils within 
schools within districts for a three-level model), and apparently want to 
carry out some estimation or other analysis while taking into account the 
(possible) covariances between levels.
If you want a simpler method than ML modelling, the method Dennis 
proposed at least lets you see some aggregate effects.  (This does, 
however, put me in mind of a paper of (I think) Brian Joiner's whose 
temporary working title was "To aggregate is to aggravate" -- though it 
was published under another title.)  ;-)
Along the lines of Dennis' suggestion, you could plot Y vs X2 
(or X2 vs Y) directly, which would give you the visual effect Dennis 
showed while at the same time showing the scatter in the X2 dimension 
around the organization average.  For larger data sets with more 
organizations in them (so that perhaps several organizations would have 
the same (or at any rate indistinguishable, at the resolution of the 
plotting device used) turnover rate), you could generate a letter-plot 
(MINITAB command:  LPLOT), using the organization ID in X1 as a labelling 
variable.

Brett's original post presented this data structure:

 A colleague has a data set with a structure like the one below:
 
 ID  X1  X2    Y
 1   1   0.70  0.40
 2   1   0.80  0.40
 3   1   0.65  0.40
 4   2   1.20  0.25
 5   2   1.10  0.25
 6   3   0.90  0.30
 7   4   0.50  0.50
 8   4   0.60  0.50
 9   4   0.70  0.50
 
 Where X1 is the organization.  X2 is the percent of market salary an
 employee within the organization is paid -- i.e. ID 1 makes 70% of the 
 market salary for their position and the local economy.  And Y is the 
 annual overall turnover rate in the organization, so it is constant 
 across individuals within the organization.  There are different 
 numbers of employee salaries measured within each organization.  The 
 goal is to assess the relationship between employee salary (as percent 
 of market salary for their position and location) and overall 
 organizational turnover rates.

 How should these data be analyzed?  The difficulty is that the data are 
 cross level.  Not the traditional multi-level model however.  That 
 there is no variance across individuals within an organization on the 
 outcome is problematic.  Of course, so is aggregating the individual 
 results.  How can this be modeled both preserving the fact that there is 
 variance within organizations and between organizations?

As I understand it (as implied above), this is exactly the kind of 
structure for which multilevel methods were invented.

 I suggested that this was a repeated measures problem, with repeated 
 measurements within the organization, my colleague argued it was not. 

This strikes me as a possible approach (repeated measures can be treated 
as a special case of multilevel modelling).  But most software that I 
know of that would handle repeated-measures ANOVA would tend to insist 
that there be equal numbers of levels of the repeated-measures factor 
throughout the design, and this appears not to be the case (your sample 
data, at any rate, have different numbers of individuals in the several 
organizations).

 Can this be modeled appropriately with traditional regression models at 
 the individual level?  That is, ignoring X1 and regressing Y ~ X2. 

That was, after a fashion, what Dennis illustrated.  In a formal 
regression analysis, I should think it unnecessary to ignore X1;  
although it would doubtless be necessary to recode it into a series of 
indicator-variable dichotomies, or something equivalent.

 It seems to me that this violates the assumption of independence. 

Not altogether clear.  By this do you mean regression analysis?  
Or, perhaps, the particular analysis you suggested, ignoring X1?  Or...? 
And what assumption of independence are you referring to?  (At any 
rate, what such assumption that would not be violated in other formal 
analyses, e.g. repeated-measures ANOVA?)

 Certainly, the percent of market salary that an employee is paid is 
 correlated between employees within an organization (taking into 
 account things like tenure, previous experience, etc.).

Well, would the desired model take such things into account? 
(If not, why not?  If so, where is the problem that I rather vaguely 
sense lurking between the lines here?)
-- Don.