Re: Predictions & duplications

juergen Tue, 23 Oct 2001 08:52:37 -0700

> From: [EMAIL PROTECTED]:
> [EMAIL PROTECTED] wrote:
> > > From [EMAIL PROTECTED]:
> > > [EMAIL PROTECTED] wrote:
> > > > M measure:
> > > > M(empty string)=1
> > > > M(x) = M(x0)+M(x1) nonnegative for all finite x.
> > > 
> > > This sounds more like a probability distribution than a measure. In
> > > the set of all descriptions, we only consider infinite length
> > > bitstrings. Finite length bitstrings are not members. However, we can
> > > identify finite length bitstrings with subsets of descriptions. The
> > > empty string corresponds to the full set of all descriptions, so the
> > > first line M(empty string)=1 implies that the measure is normalisable
> > > (ie a probability distribution).
> > 
> > Please check out definitions of measure and distribution! 
> > Normalisability is not the critical issue.
> > 
> > Clearly: Sum_x M(x) is infinite. So M is not a probability
> > distribution. M(x) is just measure of all strings starting with x:
> > M(x) = M(x0)+M(x1) = M(x00)+M(x01)+M(x10)+M(x11) = ....
> 
> For a measure to be normalisable, the sum over a *disjoint* set of
> subsets must be finite. If the set of subsets is not disjoint (ie the
> intersection are not empty) then the sum may well be infinite.
> Bringing this to the case of finite strings. Each finite string is
> actually a subset of the set of infinite strings, each containing the
> same finite prefix. So the string 101010 is actually a subset of 10101
> and so on. The Sum_x M(x), where I assume x ranges over all strings
> will of course be infinite. However, since the set of finite strings
> is not disjoint, this doesn't imply M(x) is unnormalisable.
> Now when you realise that every finite string x is a subset of the empty
> string, it becomes clear that M(x) is normalised to precisely 1.


The point is: prob dists and measures are different things. There is a
good reason for giving them different names. Prob dists assign numbers
to individual objects, not to sets.  Traditional definitions as well as
those for semimeasures in http://www.idsia.ch/~juergen/toesv2/

> > Neglecting finite universes means loss of generality though.
> > Hence measures mu(x) in the ATOE paper do not neglect finite x: 
> > 
> > mu(empty string)=1
> > mu(x) = P(x)+mu(x0)+mu(x1)  (all nonnegative).
> > 
> > And here P is a probability distribution indeed! 
> > P(x)>0 possible only for x with finite description. 
> > 
> 
> The P(x)>0 case above actually breaks the countably subadditive
> property, so mu(x) cannot be called a measure... I'm not entirely sure
> what you're getting at here.

The set of computable universes includes the finite ones which we may 
not ignore. How do they contribute to the measure of all universes?
For convenience introduce a third symbol $ besides 0 and 1. Now what is 
the measure of the set of all universes starting with 00110? It's the sum of
the measure of the set of universes starting with 001101 and
the measure of the set of universes starting with 001100 and
the measure of the single universe 00110$ that stops right after 00110. 
Once there is a $ symbol there won't be another 0 or 1.  
So mu(x) = P(x)+mu(x0)+mu(x1).  

Our favorite example is the universal Turing machine with random inputs.
For finite x: P(x) is the probability that the TM output is x and nothing
but x. mu(x) is something different, namely, the so-called universal measure 
of all strings _starting_ with x.  Again mu(x) = P(x)+mu(x0)+mu(x1).  

Juergen Schmidhuber                            http://www.idsia.ch/~juergen/

Re: Predictions & duplications

Reply via email to