In article <[EMAIL PROTECTED]>, Rich Ulrich <[EMAIL PROTECTED]> wrote:
>On 21 Mar 2003 16:51:15 -0800, [EMAIL PROTECTED] (Dennis Roberts) wrote:
>> At 09:55 PM 3/21/03 +0000, Jerry Dallal wrote:
>> >dennis roberts wrote:
>> > > could someone give an example or two ... of how p values have really
>> > > advanced our knowledge and understanding of some particular phenomenon?
>> >Pick up any issue of JAMA or NEJM.

>> no ... that is not sufficient ... just to look at a journal ... where p
>> values are used ... does not answer the question above ... that's circular
>> and just shows HOW they are used ... not what benefit is derived FROM their use

>> i would like an example or two where ... one can make a cogent argument
>> that p ... in its own right ... helps us understand the SIZE of an effect
>> ... the IMPORTANCE of an effect ... the PRACTICAL benefit of an effect

>> maybe you could select one or two instances from an issue of the journal
>> ... and lay them out in a post?

>Here's my selection of instances.

>Whenever I read of striking epidemiology results
>(usually, first in the newspaper), I *always* calculate
>the p-level for whatever the gain was.
>It *very* often is barely beyond 5%.

>Also, it is *very* often based on testing, say, 20
>or 50 dietary items; or on some intervention with
>little experimental history.

>So I say to myself, "As far as being a notable 'finding'
>goes, this does not survive adjustment for multiple-testing.
>I will doubt it."

The problem with this is that we have at least 50 major factors involved,
and if the adjustment for multiple testing is made, nothing can be
statistically significant. In fact, 50 may even be an underestimate; I
believe that there are more than 50 known antioxidants.

>Then, I am not at all surprised when the followup studies in
>five or ten years show that eggs are not so bad after all,
>real butter is not so bad, or Hormone Replacement Therapy
>is not necessarily so good. Serious students of the fields
>were not at all surprised, basically, for the same reasons
>that I'm not surprised.

It is even worse than this; in many of the cases, there were not even
studies. It was stated that eggs were bad because eggs have cholesterol,
and cholesterol levels IN THE BLOOD were bad. The rest was conclusion
jumping. At this time, we have no good studies of any dietary effects
other than weight.

>Quite often, as with the HRT, there is the additional problem
>of non-randomization. For that, I also consider the effect size
>and say, "Can I imagine particular biases that could
>cause this result?" My baseline here is sometimes the
>'healthy-worker effect' -- Compared with the non-working
>population, workers have an age-adjusted SMR (Standardized
>Mortality Ratio) for heart diseases of about 0.60. That's
>mostly blamed on selection, since sick people quit work.

>Something that is highly "significant" can still be unbelievable,
>but I hardly waste any time before I disbelieve something
>that is 5% "significant" and probably biased.

One of the problems with medical studies is subjects dropping out, causing
the distributions of subjects in the experimental and control groups to
become different. The standard procedure is to ignore this if the
difference in dropout proportions is "not statistically significant."
This effect, however, might be greater than the one being investigated.

>If you pay no attention to the p-value, you toss out
>what is arguably the single most useful screen we have.

The question is, what is being screened? Any decision approach indicates
that the critical p-value should decrease with increasing sample size;
equivalently, it should increase with decreasing sample size. It is often
the case that evidence is needed to accept, not only to reject.
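To put rough numbers on the two claims above, here is a minimal Python
sketch. It assumes a plain Bonferroni correction for the 50 factors and a
BIC-style comparison as one simple stand-in for a decision-theoretic rule;
neither choice is specified in the posts above.

    # Back-of-the-envelope calculations.  The Bonferroni correction and
    # the BIC-style threshold are illustrative assumptions, not methods
    # taken from the discussion above.
    from math import erf, log, sqrt

    def normal_upper_tail(z):
        # P(Z > z) for a standard normal variate
        return 0.5 * (1.0 - erf(z / sqrt(2.0)))

    # 1. Multiple testing: screening 50 factors at a nominal overall
    #    5% level gives a Bonferroni per-test threshold of 0.001, so a
    #    result with p just under 0.05 does not survive the adjustment.
    n_factors = 50
    print("Bonferroni per-test threshold:", 0.05 / n_factors)

    # 2. Sample size: a BIC-style comparison rejects the null roughly
    #    when z**2 > log(n), so the implied critical p-value falls as
    #    n grows and exceeds 5% for small n.
    for n in (20, 100, 1000, 100000):
        z_crit = sqrt(log(n))
        p_crit = 2.0 * normal_upper_tail(z_crit)  # two-sided
        print(n, round(p_crit, 4))

With these stand-ins, the implied two-sided critical p-value is roughly
0.08 at n = 20 and roughly 0.0007 at n = 100000, which is the direction
of the claim above.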
>Of course, Herman gives the example where the
>purpose is different. We may want to explore potential
>new leads, instead of setting our standards to screen
>out trivial results. It seems to me that the "decision theorist"
>is doing a poor job of discriminating tasks here.

The loss-prior combination must be given by the investigator, NOT the
statistician. To do this, the investigator needs to think
probabilistically, not in terms of statistical methods.

>I certainly never suggested that there is only one way
>to regard results -- I have periodically told posters that
>their new studies with small N and many variables
>seemed to set them up with "exploratory" paradigms,
>rather than "testable" hypotheses.

If one has the "clean" situations of the early physical sciences, one can
get away with some of this. Consider, however, the "Kepler problem": with
one less decimal place in the available data, it would have been
impossible to distinguish a circle from an ellipse, and with one more
place, as happened with telescopic data, orbits as Kepler considered them
would not even have existed. And what would the development of the gas
laws have been if cases which did not behave like ideal gases had had to
be considered? No quantitative fit, other than an ad hoc one, was obtained
from the data; theory was needed. The idea that theory is generated from
data is wrong; data can only help confirm or deny.

--
This address is for information only. I do not claim that these views are
those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
[EMAIL PROTECTED]  Phone: (765)494-6054  FAX: (765)494-0558
