On 21 Mar 2003 16:51:15 -0800, [EMAIL PROTECTED] (Dennis Roberts) wrote:

> At 09:55 PM 3/21/03 +0000, Jerry Dallal wrote:
> >dennis roberts wrote:
> >
> > > could someone give an example or two ... of how p values have really
> > > advanced our knowledge and understanding of some particular phenomenon?
> >
> >Pick up any issue of JAMA or NEJM.
> 
> no ... that is not sufficient ... just to look at a journal ... where p 
> values are used ... does not answer the question above ... that's circular 
> and just shows HOW they are used ... not what benefit is derived FROM their use
> 
> i would like an example or two where ... one can make a cogent argument 
> that p ... in its own right ... helps us understand the SIZE of an effect 
> ... the IMPORTANCE of an effect ... the PRACTICAL benefit of an effect
> 
> maybe you could select one or two instances from an issue of the journal 
> ... and lay them out in a post?

Here's my selection of instances.  

Whenever I read of striking epidemiology results 
(usually, first in the newspaper), I *always* calculate 
the p-value for whatever the reported gain was.  
It is *very* often barely beyond 5%.   
Also, it is *very* often based on testing, say, 20 
or 50 dietary items, or on some intervention with 
little experimental history.
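
To make that back-calculation concrete: here is a minimal sketch, 
assuming the report gives a relative risk with a 95% confidence 
interval (the usual Altman-Bland reconstruction).  The RR and CI 
values below are invented for illustration, not taken from any study.

    import math

    def p_from_ci(rr, lo, hi):
        """Two-sided p-value for H0: RR = 1, given a 95% CI on the RR."""
        log_se = (math.log(hi) - math.log(lo)) / (2 * 1.96)  # SE of log(RR)
        z = math.log(rr) / log_se                            # Wald z statistic
        return math.erfc(abs(z) / math.sqrt(2))              # = 2*(1 - Phi(|z|))

    # A "striking" headline result: RR = 1.30, 95% CI (1.02, 1.66)
    print(round(p_from_ci(1.30, 1.02, 1.66), 3))  # ~0.035 -- barely beyond 5%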

So I say to myself, "As far as being a notable 'finding'  
goes, this does not survive adjustment for multiple-testing.
I will doubt it."
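
The adjustment I have in mind is nothing fancier than Bonferroni: 
a nominal p only stays notable if it beats alpha divided by the 
number of items examined.  A toy check, with the counts of items 
purely illustrative:

    def survives_bonferroni(p, n_tests, alpha=0.05):
        # Bonferroni: compare p against alpha split across all the tests
        return p < alpha / n_tests

    p = 0.03   # the kind of "barely beyond 5%" result described above
    for n in (1, 20, 50):
        print(n, survives_bonferroni(p, n))   # 1 True, 20 False, 50 False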

Then, I am not at all surprised when the followup studies,
five or ten years later, show that eggs are not so bad after all,
real butter is not so bad, or Hormone Replacement Therapy
is not necessarily so good.  Serious students of those fields 
were not at all surprised either, basically for the same reasons 
that I'm not surprised.

Quite often, as with HRT, there is the additional problem
of non-randomization.  For that, I consider the effect size 
as well, and ask, "Can I imagine particular biases that could 
cause this result?"   My baseline here is sometimes the 
'healthy-worker effect' -- compared with the non-working 
population, workers have an age-adjusted SMR (Standardized 
Mortality Ratio) for heart disease of about 0.60.  That's 
mostly blamed on selection, since sick people quit work.
Something that is highly "significant"  can still be unbelievable,
but I hardly waste any time before I disbelieve something
that is 5%  "significant"  and probably biased.
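
For readers who have not met the SMR: it is just observed deaths 
over expected deaths, where "expected" applies the reference 
population's age-specific rates to the cohort's person-years.  A 
sketch of the arithmetic, with every rate and count invented 
purely for illustration:

    def smr(observed_deaths, person_years, ref_rates):
        # Expected deaths: reference rates applied to the cohort's person-years
        expected = sum(person_years[age] * ref_rates[age]
                       for age in person_years)
        return observed_deaths / expected

    person_years = {"40-49": 10_000, "50-59": 8_000, "60-69": 2_000}
    ref_rates    = {"40-49": 0.002,  "50-59": 0.006, "60-69": 0.015}  # deaths/PY
    print(round(smr(59, person_years, ref_rates), 2))  # 0.6 -- healthy-worker range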


If you pay no attention to the p-value, you toss out
what is arguably the single most useful screen we have.

Of course, Herman gives the example where the 
purpose is different.  We may want to explore potential
new leads, instead of setting our standards to screen 
out trivial results.  It seems to me that the "decision theorist"  
is doing a poor job of discriminating tasks here.

I certainly never suggested that there is only one way
to regard results -- I have periodically told posters that
their new studies with small N and many variables 
seemed to set them up with "exploratory"  paradigms,
rather than "testable" hypotheses.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html