Re: Hypothesis testing and magic - episode 2

2000-04-20 Thread P.G.Hamer

Jerry Dallal wrote:

 As Tukey has pointed out, the null hypothesis of no effect
 is not that we think there is no effect, but that we are uncertain
 of the direction.

 I wish I knew more about Delaney and its application.
 One problem, pointed out by David Salsburg, is that a
 substance that eliminates one of many competing risks
 would appear to increase the other risks.
 For example, people no longer subject to heart disease
 would undoubtedly see an increased incidence of cancer, with
 all-cause mortality holding steady at 100%.

I would hope that such risks would be measured as probability per unit
time, and so the first-order effects of `we all die' would be removed.
Which still leaves the second-order effects due to the lengthy induction
process of many cancers.

BTW an even greater problem in animal testing seems to be due to using
feed-on-demand systems. The little critters are usually bored out of
their minds and overeat, causing a variety of health problems. So any
drug that makes them mildly unwell can easily spoil their appetite --
and make them look healthier.

Peter










Re: Hypothesis testing and magic - episode 2

2000-04-20 Thread Rich Ulrich

On Thu, 20 Apr 2000 10:48:38 +0100, "P.G.Hamer"
[EMAIL PROTECTED] wrote:
  [snip: interesting stuff about how proper age-adjusted life-tables,
with proper adjustment of base-line Ns, would not show an increase in
competing causes of death]
 
 BTW an even greater problem in animal testing seems to be due to using
 feed-on-demand systems. The little critters are usually bored out of
 their minds and overeat, causing a variety of health problems. So any
 drug that makes them mildly unwell can easily spoil their appetite --
 and make them look healthier.

I never knew that!  

But that might be similar to, or might underlie, another thing that I
was once told about laboratory rats.

I had been impressed by the newspaper reports that rats lived longer
if they were underfed, i.e., on very-low-calorie diets.  Then my
lab-tech friends told me that the lab rats tended to live to a certain
*size* rather than age.  The starved ones took 30% longer to reach
that same size.  So my friends were not at all impressed by those news
reports.  [ There may be newer data that are more impressive.]

I later realized that humans and dogs are in the minority among
mammals, in that we achieve "adult" size and then stop growing.  For
elephants and moose and bears, etc., the stereotype from childhood
nature stories is not all invention.  If the  clever "old man of the
woods/jungle/forest" is the wisest and the oldest, he is likely to be
the biggest, because most critters never stop growing.  That seemed to
tie in to the rat-life-spans, too.

-- 
Rich Ulrich, [EMAIL PROTECTED] 
http://www.pitt.edu/~wpilib/index.html





Re: Hypothesis testing and magic - episode 2

2000-04-18 Thread Jerry Dallal

Herman Rubin wrote:
 
 The truth myth is highly persistent.  We have the Delaney
 Clause, which requires the FDA to ban any additive "which
 has been found to cause cancer in humans or animals".
 Now what does this mean?  It is unlikely that anything
 does not affect the cancer rate.
 
 We do not have the truth, and will not get it.  That
 point null hypothesis is false.  So we need to get off
 the tack that we want to accept if it is true, and
 reject if it is false.

As Tukey has pointed out, the null hypothesis of no effect
is not that we think there is no effect, but that we are uncertain
of the direction.

I wish I knew more about Delaney and its application.
One problem, pointed out by David Salsburg, is that a
substance that eliminates one of many competing risks
would appear to increase the other risks.
For example, people no longer subject to heart disease
would undoubtedly see an increased incidence of cancer, with
all-cause mortality holding steady at 100%.
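
A back-of-the-envelope way to see Salsburg's point (a rough sketch in
Python; the hazard rates and function name are made up purely for
illustration, not anyone's published figures):

import random

# Hypothetical constant annual hazards -- invented numbers for illustration only
HAZARD_HEART = 0.02   # chance of dying of heart disease in a given year
HAZARD_CANCER = 0.01  # chance of dying of cancer in a given year

def simulate_cohort(n, heart_hazard, cancer_hazard, seed=1):
    """Follow n people until death; return the fraction dying of each cause."""
    rng = random.Random(seed)
    deaths = {"heart": 0, "cancer": 0}
    for _ in range(n):
        while True:
            if rng.random() < heart_hazard:
                deaths["heart"] += 1
                break
            if rng.random() < cancer_hazard:
                deaths["cancer"] += 1
                break
    return {cause: count / n for cause, count in deaths.items()}

print(simulate_cohort(50_000, HAZARD_HEART, HAZARD_CANCER))  # roughly 2/3 heart, 1/3 cancer
print(simulate_cohort(50_000, 0.0, HAZARD_CANCER))           # 100% cancer

Eliminating the heart-disease hazard leaves the per-year cancer hazard
unchanged, yet the lifetime share of deaths attributed to cancer jumps to
100% -- all-cause mortality holds steady at 100% either way.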





Re: Hypothesis testing and magic - episode 2

2000-04-13 Thread Michael Granaas

On Thu, 13 Apr 2000, Alan McLean wrote:

 Some more comments on hypothesis testing:
 
 My impression of the ‘hypothesis test controversy’, which seems to exist
 primarily in the areas of psychology, education and the like, is that it
 is at least partly a consequence of the sheer difficulty of carrying out
 quantitative research in those fields. A root of the problem seems to be
 definitional. I am referring here to the definition of the variables
 involved.
 
 In, say, an agricultural research problem it is usually easy enough to
 define the variables. For a very simple example, if one is interested in

In addition to defining the variables, some areas do a better job of
defining and therefore testing their models.  The ag example is one where
not only are the variables relatively clear, so are the models.  That is,
there is one highly plausible reason for rejecting a null that fertilizer
does not affect crop production:  Fertilizer increases crop production.
You have rejected a model of no effect in favor of a model positing an
effect.

But in some areas in psychology you will have a situation where many
theoretical perspectives predict the same outcome relative to a zero
valued null while the zero valued null reflects no theoretical
perspective.  In this situation rejecting a zero valued null supports all
theoretical perspectives equally and differentiates among none of them.

In a recent example a student was citing the research literature
supporting the convergent validity of some measure.  The evidence used by
all investigators was that the null of rho = 0 was rejected.  I've seen
this same thing many times, but this time I saw something different.  The
smallest sample (n about 95) failed to reject rho = 0 while the remaining
samples (all n's > 200) successfully rejected rho = 0 and convergent
validity was declared.  (No r's were actually reported in this review.)

A quick thought experiment, and check of critical value tables, suggests
that the best estimate of rho from the evidence provided is  some value
greater than 0 but less than .20.  
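
For concreteness, here is a rough sketch of that thought experiment in
Python (scipy assumed; the two-tailed alpha of .05 is my assumption,
since no alphas or r's were reported in the review):

from math import sqrt
from scipy.stats import t

def critical_r(n, alpha=0.05):
    """Smallest |r| that rejects H0: rho = 0 (two-tailed) at the given alpha."""
    df = n - 2
    t_crit = t.ppf(1 - alpha / 2, df)
    return t_crit / sqrt(t_crit ** 2 + df)

for n in (95, 200, 400):
    print(n, round(critical_r(n), 3))
# 95  -> about .20  (failing to reject puts r somewhere below ~.20)
# 200 -> about .14  (rejecting puts r somewhere above ~.14)
# 400 -> about .10

An r below about .20 in the smallest sample and above about .14 in the
larger ones is consistent with a rho somewhere between 0 and .20.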

In this case it seems to me that testing the default zero valued null was
misleading rather than informative.  In addition to convergent validity it
seems to me that correlations in the range 0 - .20 could easily be
explained by at least a couple of other competing models that would not
support the conclusions drawn.  Only the most trivial link between
theoretical models and statistical hypotheses exists in this case.

Using Alan's ethnicity and statistical ability example, and assuming for
the moment that all measures were useful, the first time we reject a no
effect null we have some sort of useful information.  Now, imagine that 12
researchers generate 12 different hypotheses explaining the cause of these
differences.  Current practice has all 12 of these researchers collecting
data and testing to eliminate the chance model and then declaring that
their hypothesis  has been confirmed.
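
To make the same point with a toy simulation (Python with numpy and scipy;
the two "stories", their effect sizes, and the sample sizes are all
invented for illustration):

import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)

# Two hypothetical causal stories that both predict "the group means differ".
# The effect sizes (0.5 and 0.3 SD) are made up purely for illustration.
for label, delta in [("story A", 0.5), ("story B", 0.3)]:
    control = rng.normal(0.0, 1.0, 500)
    treated = rng.normal(delta, 1.0, 500)
    t_stat, p = ttest_ind(treated, control)
    print(label, "p =", round(p, 5))

With samples this large, both p-values come out far below .05, so rejecting
the zero-valued null "confirms" either story equally well -- which is the
problem.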

I agree that measurement is a problem, but even with good measurement the
lack of connection between statistical hypotheses and theoretical
predictions is a fatal flaw in too many areas.

Michael

 
 Regards again,
 Alan
 
 
 --
 Alan McLean ([EMAIL PROTECTED])
 Department of Econometrics and Business Statistics
 Monash University, Caulfield Campus, Melbourne
 Tel:  +61 03 9903 2102    Fax: +61 03 9903 2007
 
 
 
 
 

***
Michael M. Granaas
Associate Professor            [EMAIL PROTECTED]
Department of Psychology
University of South Dakota Phone: (605) 677-5295
Vermillion, SD  57069  FAX:   (605) 677-6604
***
All views expressed are those of the author and do not necessarily
reflect those of the University of South Dakota, or the South
Dakota Board of Regents.




Re: Hypothesis testing and magic - episode 2

2000-04-13 Thread dennis roberts

At 08:37 AM 4/13/00 -0400, Art Kendall wrote:
in the "harder to do" sciences it is common to distinguish an experiment 
from a
quasi-experiment.

Part of the difficulty of these fields is that we can not (or ethically may
not) manipulate many independent variables.  Therefore we lose the opportunity
to assert "et ceteris paribus" "everything else being equal" that is part of a
true experiment.

there goes medicine!

if this is a real distinction ... then, instead of having 'hard' and 'soft' 
sciences ... we should think of it as:

hard and soft investigations ...

but, if we follow this to some logical conclusion ... this could be 
rephrased as meaning ...

situations where you have essentially complete control over variable 
manipulation  = situations where you can establish 'the truth' (in 
terms of the impacts of these variables on things)  ... but, this is 
precisely what many have been arguing on the list about that hypothesis 
testing ... statistical significance testing that is ... is in NO position 
to help you assert 'the truth' ... truth is a metaphysical notion ... not 
statistical

in essence, if 'the truth' is a laudable goal and, for some reason we can 
'learn of it' through 'scientific investigation' ... then it is NOT 
significance testing that leads us to it ... ... rather it is the DESIGN of 
investigations that is the key ...










Re: Hypothesis testing and magic - episode 2

2000-04-13 Thread dennis roberts

At 10:23 AM 4/13/00 -0500, Michael Granaas wrote:

In addition to defining the variables, some areas do a better job of
defining and therefore testing their models.  The ag example is one where
not only are the variables relatively clear, so are the models.  That is,
there is one highly plausible reason for rejecting a null that fertilizer
does not affect crop production:  Fertilizer increases crop production.
You have rejected a model of no effect in favor of a model positing an
effect.

i did not know that ag research ... in this case, production figures ... 
was so easily accomplished ...

it might be relatively easy to distribute fertilizer in different amounts 
... over plots ... but even there, there is considerable error ... check 
out the way our spreaders work on our lawns? and in addition ... every 
fertilizer i know of is a product that is an amalgamation of several 
subproducts ... and inert stuff too ... so the distribution of it over 
plots will not produce identical spreads ...

and ... how is production measured? to compare across plots means gathering 
in crops ... and making some kind of 'volume' measurements ... and that 
seems much easier said than done

now, i would not like to say that doing a fertilizer experiment has the
same amount of 'error' as maybe one where we ask if different levels of
intelligence impact differentially on problem success in later life ... but
these differences are more a matter of degree ... than a case where one is
easy ... and the other is not

maybe we should ask the ag researchers if THEY think doing their research 
is simple








Re: Hypothesis testing and magic - episode 2

2000-04-13 Thread Michael Granaas

On Thu, 13 Apr 2000, dennis roberts wrote:

 At 10:23 AM 4/13/00 -0500, Michael Granaas wrote:
 
 In addition to defining the variables, some areas do a better job of
 defining and therefore testing their models.  The ag example is one where
 not only are the variables relatively clear, so are the models.  That is,
 there is one highly plausible reason for rejecting a null that fertilizer
 does not affect crop production:  Fertilizer increases crop production.
 You have rejected a model of no effect in favor of a model positing an
 effect.
 
 i did not know that ag research ... in this case, production figures ... 
 was so easily accomplished ...

I didn't say that there wasn't a lot of work involved.  What I said was
that there is a clear link between the experimental manipulation, the
outcome variable, the hypothesis test results, and the question asked.
This particular example was based on the type of question that Fisher
might have been dealing with circa 1925.  If my interpretation of history
is correct and Fisher's ag research was focused on treatment/no-treatment
effects, I think it helps us both understand the strength of his method in
that setting and identify a potential weakness in our own use.

The methods of Fisher are useful when there is a strong link between the
substantive and statistical hypothesis.  The convergent validity example I
used, I thought, showed a weak link between the substantive question and
the statistical hypothesis (rho = 0).  (I admit my ignorance of current
practice, but I am pretty sure a correlation merely different from 0 is not
evidence that two measures are measuring the same thing.)  The weakness
of that link leaves the researcher without any useful information from
their statistical decision.  (I.e., knowing that the correlation is other
than zero does not establish convergent validity.)  Testing a null
hypothesis of, for example, rho <= .7 (there may be a more appropriate
value; I just picked this one because I like 7's and it illustrates my
point) would provide a much better match between the statistical decision
and the substantive question.  (Rejecting rho <= .7 would indicate a
degree of correlation consistent with convergent validity; failing to do
so would leave the question open.)
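
One way to carry out such a test is Fisher's z transformation; a minimal
sketch in Python (scipy assumed; the rho0 of .7, the one-sided direction,
and the example r and n are just illustrations of my point, not anyone's
recommended values):

from math import atanh, sqrt
from scipy.stats import norm

def test_rho_above(r, n, rho0=0.7):
    """One-sided test of H0: rho <= rho0 vs H1: rho > rho0 via Fisher's z.
    Rejection indicates a correlation large enough to support convergent validity."""
    z = (atanh(r) - atanh(rho0)) * sqrt(n - 3)
    p = 1 - norm.cdf(z)
    return z, p

z, p = test_rho_above(0.80, 200)   # an observed r of .80 with n = 200
print(f"z = {z:.2f}, one-sided p = {p:.4f}")   # z about 3.2, p about .0006

Failing to reject with this null would leave the validity question open,
which is exactly the asymmetry I want.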

There are areas in psychology that have also done a good job of making the
links between their substantive and statistical hypotheses and seem to
have made a good deal of progress in knowledge generation.  There are
others that have not.  (I have heard roughly the same argument made for
physics, chemistry, and biology and I  expect that this is generally
true for a number of other research disciplines.)

With or without a link between substantive and statistical hypothesis, I
acknowledge, nay, I proclaim, that research in any discipline is hard
work for all the reasons you suggest and more.  Any disrespect for the
field of agricultural research was unintended and I apologize to anyone
whom I may have offended.

I also freely admit that ag research today is going to be very different
from the ag research of 1925.

Michael

***
Michael M. Granaas
Associate Professor            [EMAIL PROTECTED]
Department of Psychology
University of South Dakota Phone: (605) 677-5295
Vermillion, SD  57069  FAX:   (605) 677-6604
***
All views expressed are those of the author and do not necessarily
reflect those of the University of South Dakota, or the South
Dakota Board of Regents.






Re: Hypothesis testing and magic - episode 2

2000-04-13 Thread Alan McLean

Hi Michael,

This sounds to me like lousy experimental design. Surely the purpose of the
experiment is to distinguish between competing theoretical models?

Michael Granaas wrote:

 But in some areas in psychology you will have a situation where many
 theoretical perspectives predict the same outcome relative to a zero
 valued null while the zero valued null reflects no theoretical
 perspective.  In this situation rejecting a zero valued null supports all
 theoretical perspectives equally and differentiates among none of them.

and I think that is what you are saying here.


 I agree that measurement is a problem, but even with good measurement the
 lack of connection between statistical hypotheses and theoretical
 predictions is a fatal flaw in too many areas.


Regards,
Alan


--
Alan McLean ([EMAIL PROTECTED])
Department of Econometrics and Business Statistics
Monash University, Caulfield Campus, Melbourne
Tel:  +61 03 9903 2102    Fax: +61 03 9903 2007







Re: Hypothesis testing and magic - episode 2

2000-04-13 Thread Alan McLean

dennis roberts wrote:

 but, if we follow this to some logical conclusion ... this could be
 rephrased as meaning ...

 situations where you have essentially complete control over variable
 manipulation  = situations where you can establish 'the truth' (in
 terms of the impacts of these variables on things)  ... but, this is
 precisely what many have been arguing on the list about that hypothesis
 testing ... statistical significance testing that is ... is in NO position
 to help you assert 'the truth' ... truth is a metaphysical notion ... not
 statistical

 in essence, if 'the truth' is a laudable goal and, for some reason we can
 'learn of it' through 'scientific investigation' ... then it is NOT
 significance testing that leads us to it ... ... rather it is the DESIGN of
 investigations that is the key ...

Truth has nothing to do with it. We construct stories of how the universe operates -
we call these stories 'theories' or 'models'. Significance testing is one way in
which we choose between stories as to which is (probably) more useful in a
specified context.

Alan


--
Alan McLean ([EMAIL PROTECTED])
Department of Econometrics and Business Statistics
Monash University, Caulfield Campus, Melbourne
Tel:  +61 03 9903 2102    Fax: +61 03 9903 2007







Re: Hypothesis testing and magic - episode 2

2000-04-13 Thread David A. Heiser


- Original Message -
From: Michael Granaas [EMAIL PROTECTED]
To: EDSTAT list [EMAIL PROTECTED]
Sent: Thursday, April 13, 2000 8:23 AM
Subject: Re: Hypothesis testing and magic - episode 2
 In addition to defining the variables, some areas do a better job of
 defining and therefore testing their models.  The ag example is one where
 not only are the variables relatively clear, so are the models.  That is,
 there is one highly plausible reason for rejecting a null that fertilizer
 does not affect crop production:  Fertilizer increases crop production.
 You have rejected a model of no effect in favor of a model positing an
 effect.

 But in some areas in psychology you will have a situation where many
 theoretical perspectives predict the same outcome relative to a zero
 valued null while the zero valued null reflects no theoretical
 perspective.  In this situation rejecting a zero valued null supports all
 theoretical perspectives equally and differentiates among none of them.

 In a recent example a student was citing the research literature
 supporting the convergent validity of some measure.  The evidence used by
 all investigators was that the null of rho = 0 was rejected.  I've seen
 this same thing many times, but this time I saw something different.  The
 smallest sample (n about 95) failed to reject rho = 0 while the remaining
 samples (all n's > 200) successfully rejected rho = 0 and convergent
 validity was declared.  (No r's were actually reported in this review.)

 A quick thought experiment, and check of critical value tables, suggests
 that the best estimate of rho from the evidence provided is  some value
 greater than 0 but less than .20.

 In this case it seems to me that testing the default zero valued null was
 misleading rather than informative.  In addition to convergent validity it
 seems to me that correlations in the range 0 - .20 could easily be
 explained by at least a couple of other competing models that would not
 support the conclusions drawn.  Only the most trivial link between
 theoretical models and statistical hypotheses exists in this case.

 Using Alan's ethnicity and statistical ability example, and assuming for
 the moment that all measures were useful, the first time we reject a no
 effect null we have some sort of useful information.  Now, imagine that 12
 researchers generate 12 different hypotheses explaining the cause of these
 differences.  Current practice has all 12 of these researchers collecting
 data and testing to eliminate the chance model and then declaring that
 their hypothesis  has been confirmed.

Good example of many of the current problems.

1. If testing the null hypothesis provides no conclusive information, why
structure the experiment around the null hypothesis?  I quoted R.A. Fisher in
a previous message, so rather than repeat it here I will just say etcetera.
If the hypothesis explains the measured outcome, the problem is whether it
does so conclusively.  There is a lot of very well done psychology work that
comes to valid conclusions and gets published in Science.  The point is that
the work was very thorough.  You have to be very careful in establishing the
research objectives and the roadmap.

2. Re the 12 researchers with different claimed valid hypotheses: happens
all the time. Any significant work with startling claims will be retested
using alternate approaches. In this case the proof is not the statistical
test, but the fact that others can demonstrate under different conditions
that the cause put forth by one of the researchers produces the observed
result. The theory works. If they can't repeat the findings, then regardless
of statistics, the theory is not accepted. There is a lot of stuff in the
"hard sciences" that gets disproved because it just doesn't hold up under a
more careful experiment.

3. If what is being done is just mathematical exercises (the main output
from the bulk of the university stat departments), then sure, arguing
endlessly about the null hypothesis is fine. One gets visibility among one's
peers when this is done. But it sure doesn't help the researchers build up a
really good plan and method to do a first-class investigation. I fail to see
why there is so much emphasis on a null hypothesis test if the result really
is not important.

DAHeiser
Not Associated with any Stat Department, School or University





Re: Hypothesis testing and magic - episode 2

2000-04-12 Thread dennis roberts

At 09:30 AM 4/13/00 +1000, Alan McLean wrote:

In the ‘soft’ sciences it is easy enough to identify a characteristic of
interest –

alan makes good points as usual ... but i totally object to the term 'soft'
sciences ...

what does soft imply? that the science is bad ... or merely that variables
are more 'difficult' to measure ... if that is the case, these ought to be
called the 'hard' sciences

the unpleasant associations with the term 'soft' are uncalled for ... there
are excellent 'scientists' (whatEVER that means) in all fields ... and some
pretty weak ones too (and gee ... BOTH kinds get tenure!) ...

science is science ... and some practice it well ... some don't ... should
it be some demerit against them that they happen to have opted for a field
of interest ... even if many of the variables are difficult to measure?
perhaps that makes it even more challenging ... 

finally, i would not be so quick to claim that in the areas that are
non-social-science based ... that variables are all that clear and clean cut ...
there seems to be tremendous infighting about theories and how to 'validate'
them in medicine ... astronomy ... physics ... it is not like everything
there is so simple ... maybe don can pop in here with some relevant
examples ...

i am sure there are 'mean' differences in terms of these things but ...
there is a lot more WITHin variation in terms of hardness/softness ... than
between disciplines
==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/droberts.htm

