>On 29 Jan 2001 10:29:36 -0800, [EMAIL PROTECTED] (dennis roberts) wrote:

>> "P values, or significance levels, measure the strength of the evidence 
>> against the null hypothesis; the smaller the P value, the stronger the 
>> evidence against the null hypothesis"
>> 
>> 1. does the general statistical community accept this as being correct?

Rich Ulrich  <[EMAIL PROTECTED]> wrote:

>I think that the problem is, in short, "N" --
>
>For a given design/study/etc., there seems to be one "N" 
>in use, so the evidence against the null is indexed by p, if we want
>to.  Or there is effect size, but Effect is hard to balance against p
>when there are varying Ns.

There is a Bayesian argument (Lindley's "paradox") about how p-values
don't behave properly as N changes (at least if you have a null that
might be exactly true).  But if you believe this or other general
statistical arguments about why p-values are bad, then you should not
be using p-values.  The whole point of a p-value is to capture the
strength of evidence against the null, independently of things like N,
or the level of measurement error, as described in the passage quoted
by Dennis Roberts.  I'm amazed that anyone would question this.
P-values may or may not do a good job of this, but that IS what their
job is, and you either use them that way or you don't use them at all.
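
To make the Lindley point concrete, here is a rough sketch (my own toy
numbers, not anything from the quoted passage): a one-sample z test with
sigma known, a 50/50 prior on a point null, and mu ~ N(0,1) under the
alternative.  Hold the observed p-value fixed at about 0.05 and let N
grow, and the posterior probability of the null creeps toward one, even
though the p-value reports the same "strength of evidence" throughout.

    # Sketch of Lindley's "paradox": fixed p ~ 0.05, growing N.
    # Assumptions (mine, for illustration): sigma = 1, P(H0) = 0.5,
    # and mu ~ N(0, 1) under the alternative H1.
    import numpy as np
    from scipy import stats

    z = 1.96                          # fixed test statistic, two-sided p ~ 0.05
    for n in [10, 100, 1000, 10000, 100000]:
        xbar = z / np.sqrt(n)         # sample mean giving this z at sample size n
        m0 = stats.norm.pdf(xbar, scale=1/np.sqrt(n))        # marginal density under H0
        m1 = stats.norm.pdf(xbar, scale=np.sqrt(1 + 1/n))    # marginal density under H1
        print(f"N = {n:6d}   P(H0 | data) = {m0 / (m0 + m1):.3f}")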

>I base my "decision," sometimes, on the notion of 
>"How much noise is in the process?"  -  
>Two variables that correlate r= 0.2 are barely beating 
>"response bias", if there is that sort of  yes/no tendency 
>affecting those items.  So I am not impressed with r=0.2, 
>even though the N might make it p=.001.

Well, sure.  You have to consider the possibility of bias.  But
p-values were never meant to handle THAT job.  Nothing is gained by
confusing this issue with the issue of what a p-value DOES mean.
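
Incidentally, on the arithmetic of the quoted example: it really is just
N doing the work.  With the usual t test for a correlation,
t = r*sqrt(N-2)/sqrt(1-r^2), an r of 0.2 reaches p = .001 once N gets
into the few hundreds.  A quick sketch, with N values I made up:

    from scipy import stats

    r = 0.2
    for n in [50, 100, 200, 300, 500, 1000]:
        t = r * (n - 2) ** 0.5 / (1 - r**2) ** 0.5
        p = 2 * stats.t.sf(t, df=n - 2)        # two-sided p-value
        print(f"N = {n:4d}   r = {r}   p = {p:.4g}")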

>For much clinical work,  d= 0.5 SD is a frequent standard, 
>needing N=64  to meet the 5% test. When just-barely
>meeting that standard,  there is a 95% chance that 
>the underlying mean is above 0.0, right?  - that is the
>meaning of the test.

Well, no.  That isn't the meaning of a p-value.  But it seems that the
temptation to put a Bayesian interpretation on a p-value is almost
irresistible, and this is probably the best argument against their use.
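
A toy simulation makes the gap vivid.  Suppose (my assumption, purely
for illustration) that half of all studies test a true null and half
test a real effect of d = 0.5 SD, each with N = 64 and a one-sample z
test.  Among the studies that just barely meet the 5% test, the fraction
with an underlying mean above zero is nowhere near 95%:

    import numpy as np

    rng = np.random.default_rng(1)
    n, d, studies = 64, 0.5, 400_000
    true_mu = rng.choice([0.0, d], size=studies)   # half nulls, half d = 0.5 effects
    z = rng.normal(true_mu * np.sqrt(n), 1.0)      # one-sample z statistic per study
    barely = (z > 1.96) & (z < 2.2)                # just barely meeting the 5% test
    print((true_mu[barely] > 0).mean())            # roughly 0.6 in this toy world, not 0.95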

>For instance, I start out dubious about medical results that 
>publish Odds Ratios of 1.2 or 1.5, even though the p-level might
>be small because the N= 20,000.  I have in mind, especially,
>those uncontrolled  "observational studies"  that are rife
>with self-selection.  - I may end up believing the result,
>but I do want to know that the role of artifact was ruled out.

P-values were certainly never meant to address the question of whether
the observed effect is or is not causal.

>Similar to before, it should be possible to show either a 
>1-tailed CI  or a 50% range that lies *above*  the range of 
>trivial and artifactual results.

Computing a confidence interval does seem like a good idea if you're
worried about bias and think you have an idea of how big it might be.
I suspect, however, that you're putting a Bayesian interpretation on
your confidence intervals too.  So maybe you should just use a
Bayesian model?
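
For the odds-ratio example, the frequentist version of that check is
easy enough to sketch.  The counts and the "trivial" cutoff below are
invented, not from any real study: compute a one-sided 95% lower
confidence bound for the OR and ask whether it clears whatever value you
would still regard as trivial or artifactual.

    import numpy as np
    from scipy import stats

    a, b, c, d = 600, 9400, 480, 9520      # hypothetical 2x2 table, N = 20,000
    or_hat = (a * d) / (b * c)
    se_log = np.sqrt(1/a + 1/b + 1/c + 1/d)
    lower = np.exp(np.log(or_hat) - stats.norm.ppf(0.95) * se_log)   # one-sided 95% lower bound
    print(f"OR = {or_hat:.2f}, one-sided 95% lower bound = {lower:.2f}")
    print("clears 1.2" if lower > 1.2 else "does not clear 1.2")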

   Radford Neal

----------------------------------------------------------------------------
Radford M. Neal                                       [EMAIL PROTECTED]
Dept. of Statistics and Dept. of Computer Science [EMAIL PROTECTED]
University of Toronto                     http://www.cs.utoronto.ca/~radford
----------------------------------------------------------------------------

