In article <005101bfa4fb$cc623d00$[EMAIL PROTECTED]>,
David A. Heiser <[EMAIL PROTECTED]> wrote:
>> Except for posterior probability, none of these are tools
>> for the actual problems.  And posterior probability is not
>> what is wanted; it is the posterior risk of the procedure.

>> But even this relies on belief.  An approach to rational
>> behavior makes the prior a weighting measure, without
>> bringing in belief.  I suggest we keep it this way, and
>> avoid the philosophical aspects.


>Disagree.

>1. Determination of risk requires a model which is based on a belief system
>(e.g., there is/is not a minimum level of tremolite that causes
>mesothelioma). Probability is difficult enough to deal with, let alone an
>additional swamp called "risk". Those of us who have thought about
>developing outcomes in terms of risk have basically had to give it up. The
>difference in interpretation of a risk value between different people is
>much too great.

As to the difference between different people, I doubt that 
anyone questions this.  Neither probability nor risk is that
hard to think about, unless you try to make it hard or define
either or both of them.  Once that attempt is made, the
philosophical problems seem to become hopeless.

>2. Weighting is again based on a belief system. Everything is not equal. Some
>are more important than others.

It is the importance system which counts.  You are quite right;
some are more important.  It is not the statistician's job to
decide this; in fact, the statistician, as a statistician, must
not do it.

>>The data consist of what has been observed.  The likelihood
>>principle then mandates that the probabilities of unobserved
>>events become irrelevant.  This means that the typical test
>>procedures (NOT the test STATISTICS) would have to be wrong.

>3. This does not make sense. It needs something in addition.

No.  Fisher pushed for the likelihood principle, but he also
pushed for statistical significance, and these are incompatible.
If one is combining independent experiments, the likelihood
functions multiply.  As the likelihood function is a sufficient
statistic, it contains all the information in the data.

But the same likelihood function can give quite different 
p-values in different experiments.  Only a direct use of the
likelihood function avoids the paradoxes.  
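
For concreteness, here is a minimal Python sketch (using scipy;
purely illustrative) of the classic stopping-rule example: 9 heads
and 3 tails, testing theta = 0.5.  The two designs yield
likelihoods proportional to the same function of theta, yet one
p-value exceeds 0.05 and the other does not.

    # A sketch, not a prescription: same data, two stopping rules.
    from scipy.stats import binom, nbinom

    # Design A: n = 12 flips fixed in advance; one-sided p-value
    # is P(X >= 9 heads | n = 12, theta = 0.5).
    p_fixed_n = binom.sf(8, 12, 0.5)      # approx 0.073

    # Design B: flip until the 3rd tail appears; p-value is
    # P(9 or more heads before the 3rd tail | theta = 0.5).
    p_stop_rule = nbinom.sf(8, 3, 0.5)    # approx 0.033

    # Both likelihoods are proportional to theta^9 * (1-theta)^3,
    # so the likelihood principle says the evidence is the same.
    print(p_fixed_n, p_stop_rule)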

>I wrote:
>>>Let us suppose there are many plausible hypotheses. These include the
>>>"nil hypothesis", any a priori hypotheses, any idea at all that may be
>>>considered. Refer to these in terms of the set of all plausible hypotheses
>>>(including that of no effect) that are to be tested.

>>The set of all plausible hypotheses is generally uncountable,
>>even in the discrete case.

>>>The process is to pick each hypothesis and test it.

>>This cannot be done; there are too many.

>4. If this is the result, then you have a really, really bad experiment. You
>haven't thought about the problem and defined a finite region for
>exploration. I sure could not do a PhD thesis and have it accepted if I
>didn't have a defined region and objectives for the research.

Statistics is not for PhD theses alone.  In many PhD theses, 
it is used only because this has become the religion.

> 5. Let me quote R.A. Fisher "he (the investigator) should only claim that a
>phenomenon is experimentally demonstrated when he knows how to design an
>experiment so that it will rarely fail to give a significant result" (Fisher
>1929b). The experiment is then the means to obtain data to test the chosen
>hypotheses.

Fisher believed in point null hypotheses.  Even if these
are tenable as a scientific possibility, that is not what
is actually being tested.  The speed of light in a vacuum may
be constant, but as no perfect vacuum with light in it exists,
this is not what is actually being measured.

As for statistical significance, it fails when one looks
carefully at the problem.  About 50 years ago, much effort
from strong theorists went into trying to justify such
practices; they all failed.  In the testing of simple
against simple, all that survives tests of self-consistency
are Neyman-Pearson tests with the same likelihood ratio
cuts; in other words, Bayes procedures.  If one uses the
likelihood principle and the risk principle, the same is
true in general.
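
A minimal sketch of the simple-versus-simple case (Python with
scipy; the priors and losses are assumptions for illustration):
the Neyman-Pearson test and the Bayes procedure use the same
likelihood-ratio statistic, differing only in where the cut
comes from.

    from scipy.stats import norm

    # Simple H0: mu = 0 versus simple H1: mu = 1, sigma known.
    def likelihood_ratio(x, mu0=0.0, mu1=1.0, sigma=1.0):
        return norm.pdf(x, mu1, sigma) / norm.pdf(x, mu0, sigma)

    # A Bayes rule rejects H0 when the ratio exceeds
    # (prior weight on H0 * loss of a type I error) /
    # (prior weight on H1 * loss of a type II error).
    pi0, pi1 = 0.5, 0.5       # prior weights (assumed)
    loss1, loss2 = 1.0, 1.0   # losses (assumed)
    cut = (pi0 * loss1) / (pi1 * loss2)

    x = 0.8                   # an observed value
    reject_h0 = likelihood_ratio(x) > cut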

As perfection cannot be obtained, one can ask whether the
usual test procedures are at least good approximations.
Unfortunately, they are not.  If one is testing a point null,
the usual test statistics are likely to be good, but
the "significance level" to be used becomes highly
dependent on sample size.
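
A small sketch of that dependence (Python with scipy; the prior
on the alternative is an assumption for illustration): holding
the cut at z = 1.96, i.e., "p = 0.05", the evidence at the
boundary turns increasingly in favor of the null as n grows.

    import math
    from scipy.stats import norm

    sigma, tau = 1.0, 1.0     # data sd, and sd of the H1 prior on mu
    for n in (10, 100, 1000, 10000):
        se = sigma / math.sqrt(n)
        xbar = 1.96 * se                     # exactly at the 5% cut
        f0 = norm.pdf(xbar, 0.0, se)         # density under H0: mu = 0
        f1 = norm.pdf(xbar, 0.0, math.sqrt(tau**2 + se**2))  # under H1
        print(n, f0 / f1)     # Bayes factor for H0; grows like sqrt(n)
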
-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399
[EMAIL PROTECTED]         Phone: (765)494-6054   FAX: (765)494-0558

