>There is another option: sometimes prior probabilities are based on
>frequencies, and sometimes they are based exclusively or principally on
>something like "assumptions."

Excuse me, but even when prior probabilities are based on frequencies, the
use of these frequencies rests on assumptions:

1.  The things one counts up to obtain the frequencies have something to
do with each other, so that the frequency in question is a meaningful
frequency; and
2.  This meaningful frequency has something to do with the phenomenon
whose prior distribution it is being used to estimate.

Now maybe in a particular case you think you can justify THESE assumptions
(perhaps on the basis of frequencies about problems similar to this one --
which would lead to what the statisticians call a hierarchical Bayesian
model).  In the end, though, at some point you have to just stop and rest
it all on some kind of assumption.
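
To make that concrete, here is a minimal sketch (in Python, with made-up
numbers) of letting frequencies from similar past problems pin down the
prior for a new one.  It uses a moment-matching shortcut rather than a
full hierarchical model, and the Beta/Binomial setup is my illustrative
assumption, nothing more:

    # Illustrative sketch only: all rates and counts below are made up,
    # and the Beta/Binomial setup is an assumption for the example.
    past_rates = [0.12, 0.18, 0.15, 0.22, 0.10]  # frequencies from similar problems

    # Moment-match a Beta(a, b) prior to the spread of those past rates
    # (an empirical-Bayes shortcut, not a full hierarchical model).
    m = sum(past_rates) / len(past_rates)
    v = sum((r - m) ** 2 for r in past_rates) / (len(past_rates) - 1)
    c = m * (1 - m) / v - 1
    a, b = m * c, (1 - m) * c

    # New problem: 3 "successes" in 10 trials.  Beta/Binomial conjugacy
    # gives posterior Beta(a + 3, b + 7).
    post_mean = (a + 3) / (a + b + 10)
    print(f"prior Beta({a:.1f}, {b:.1f}), posterior mean {post_mean:.3f}")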

What we try to do in science is to be parsimonious about the assumptions
we introduce and to ensure, to the extent possible, that they all cohere
with each other.  Bayesianism helps with that: trying to be explicitly
Bayesian helps us discover and weed out inconsistencies.  Also, between
two models that fit the data equally well, the one with fewer free
parameters will tend to dominate.  This is the "natural Occam's Razor"
associated with Bayesian analysis, and it is why fully Bayesian machine
learning algorithms give you parsimonious models without ad hoc
approaches to avoid overfitting.
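
A toy illustration of that razor, with a coin-flip setup and data I have
invented for the purpose: compare the evidence (marginal likelihood) of a
zero-parameter model against a one-parameter one on the same flips.

    from math import comb

    # Two models for n coin flips with k heads (data invented for the toy):
    #   M0: p fixed at 1/2 -- no free parameters.
    #   M1: p unknown, uniform prior on [0, 1] -- one free parameter.
    n, k = 20, 11

    # Evidence under M0 is just the binomial likelihood at p = 1/2.
    ev_m0 = comb(n, k) * 0.5 ** n

    # Evidence under M1 averages the likelihood over the prior:
    #   integral of C(n,k) p^k (1-p)^(n-k) dp from 0 to 1 = 1 / (n + 1).
    ev_m1 = 1.0 / (n + 1)

    print(f"evidence: M0 = {ev_m0:.4f}, M1 = {ev_m1:.4f}")
    # M0 ~ 0.160 beats M1 ~ 0.048: spreading M1's prior mass over
    # parameter values the data rule out is what pays the Occam penalty.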

>Witness, for example, what happens otherwise: Professor Alan
>Dershowitz argues that evidence that Simpson previously beat his wife before
>her death has little probative value because very few wife-beaters go on to
>kill their wives.

I remember in the days of my wild youth arguing against my rabidly
anti-marijuana elders that all heroin users started on milk!
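
The fallacy in both cases is conditioning on the wrong event.  A quick
sketch with invented numbers (mine, not anything from the case) shows why
Dershowitz's statistic is beside the point once you remember that the
wife was in fact murdered:

    # All numbers invented for illustration; nothing here is case data.
    # Dershowitz's statistic conditions on the wrong event:
    p_killed_by_husband = 1 / 2500   # P(batterer kills wife) -- assumed
    p_killed_by_other = 1 / 20000    # P(murdered by someone else) -- assumed

    # But we KNOW the wife was murdered.  Conditional on that, the two
    # (mutually exclusive) sources of murder renormalize:
    posterior = p_killed_by_husband / (p_killed_by_husband + p_killed_by_other)
    print(f"P(husband did it | beating and murder) = {posterior:.2f}")
    # ~0.89 under these assumptions.  The tiny base rate is as irrelevant
    # as the tiny P(heroin user | drank milk), despite P(milk | heroin) = 1.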

Kathy Laskey
