# Re: Code length = probability distribution

On Fri, Oct 19, 2012 at 02:03:27PM -0700, meekerdb wrote:
> On 10/19/2012 10:54 AM, Stephen P. King wrote:
> >Hi,
> >
> >    I was looking up a definition and found the following:
> >http://en.wikipedia.org/wiki/Minimum_description_length
> >"Central to MDL theory is the one-to-one correspondence between
> >code length functions and probability distributions. (This follows
> >from the Kraft-McMillan inequality.) For any probability
> >distribution , it is possible to construct a code  such that the
> >length (in bits) of  is equal to ; this code minimizes the
> >expected code length. Vice versa, given a code , one can construct
> >a probability distribution such that the same holds. (Rounding
> >issues are ignored here.) In other words, searching for an
> >efficient code reduces to searching for a good probability
> >distribution, and vice versa."
> >
> >    Is this true? Would it be an approach to the measure problem of COMP?
> >
>
> Although the display math didn't show up above, I looked at the
> site.  I'm not clear on what x refers to in P(x) and C(x).  Is it
> the data being fitted, so in the coin tossing example x is a
> sequence, e.g. x={HHTTHHTHTT} and P(x) is the probability of various
> possible sequences, i.e. ignoring order but not length, it's the
> binomial distribution?  Then C(x) would be the length of the code to
> produce/describe a particular sequence x?  So a sequence
> {HHHHHHHHHHH} would have a shorter code than {TTHHTHTHTHTHH} - but
> ex hypothesi they both have the same probability?  What am I
> missing?
>
> Brent
> 

Assuming the notation of Li & Vitanyi,

C(x) would be prefix-free algorithmic complexity, which is closely
related to Kolmogorov complexity K(x) (length of shortest program):

K(x) + K_1 \leq C(x) \leq K(x) + K_2

P(x) would be the universal prior probability, discovered by
Solomonoff, and revised by Levin. It is defined by

P(x) = 2 ^ {-C(x)}

where we've assumed that C(x) is being measured in bits.

I haven't read the MDL stuff in Li & Vitanyi (too many other things,
and L&V is a "fierce book" (one of my friends' description), requiring
much study).

But I think the connection between probability distribution and
shortest length is made clear above.

Cheers

--

----------------------------------------------------------------------------
Prof Russell Standish                  Phone 0425 253119 (mobile)
Principal, High Performance Coders
Visiting Professor of Mathematics      hpco...@hpcoders.com.au
University of New South Wales          http://www.hpcoders.com.au
----------------------------------------------------------------------------

--
You received this message because you are subscribed to the Google Groups
"Everything List" group.
To post to this group, send email to everything-list@googlegroups.com.
To unsubscribe from this group, send email to