Has anyone proposed a specific implementation for the Universal Dovetailer
(UD)?  This is a program which runs all possible programs, a little bit
at a time, making progress in all of them.

For something close, here is Greg Chaitin's program to calculate Omega,
the probability that a random program will halt.  It comes from from
http://www.cs.auckland.ac.nz/CDMTCS/chaitin/omega.l, and is written in
his dialect of Lisp.

   [[[[ Omega in the limit from below! ]]]]
    
   define (all-bit-strings-of-size k)
       if = 0 k '(())
       (extend-by-one-bit (all-bit-strings-of-size - k 1))
   define (extend-by-one-bit x)
       if atom x nil
       cons append car x '(0)
       cons append car x '(1)
       (extend-by-one-bit cdr x)
   define (count-halt p)
       if atom p 0
       +
       if = success car try t 'eval read-exp car p
          1 0
       (count-halt cdr p)
   define (omega t) cons (count-halt (all-bit-strings-of-size t))
                    cons /
                    cons ^ 2 t
                         nil

Examples of calling it:

   (omega 0)
   (omega 1)
   (omega 2)
   (omega 3)
   (omega 8)

To read it, keep in mind that Lisp is a prefix style language, so that
the syntax is "operator operands".  Also, the single quote means that
the following argument is quoted rather than evaluated.  The built-in
functions car and cdr return the 1st element of a list and the remainder
of the list, respectively, and cons puts car and cdr back together to
form the original list.

The first two functions just return a list of all bit strings of size k.
These will be the programs that run.  The first function tests if k = 0
and returns (()), otherwise it calls itself on k-1 to get all k-1 bit
strings, then calls extend-by-one-bit.  The latter takes the first
element (car of x) and appends both 0 and 1 to it.  Then it recurses on
the remainder of the list.  So calling all-bit-strings-of-size-k with 1
gives ( (0) (1) ), with 2 gives ( (0 0) (0 1) (1 0) (1 1) ) and so on.
These are the possible programs which will be run.

The count-halt function will return the number of programs in list p
which halt within t steps.  If p is an atom (not a list) it returns 0,
else it tries running car p (the first element of p) for t steps and
counts 1 or 0 based on halt/no-halt.  It recurses on the remainder of
p and adds that result to the 1 or 0.

The key to this function is Chaitin's operator "try", which takes a number
of steps and a program.  It runs the program for that many steps and
returns a success/still-running flag, plus the output from the program
if any.  Above Chaitin is only using the success flag to count whether
the program has halted.

Last we have omega itself, which for parameter t calls count-halt
on all strings of length t. It then shows that this needs to be
divided by 2^t to get the halting probability.  (BTW Chaitin
has actual Lisp interpreters which can run this program at
http://www.cs.auckland.ac.nz/CDMTCS/chaitin/lm.html.)

You could get the effect of a UD, then, by calling omega with successively
larger numbers.  It would run all 1-bit programs for 1 step, then all
2-bit programs for 2 steps, all 3-bit programs for 3 steps, and so on.
It might appear that this will not, for example, run 2-bit programs for
10 steps.  However Chaitin's programs are self-delimiting.  When you
have all 10-bit programs, some of those are 2-bit programs with 8 ignored
bits at the end.  So in running N-bit programs for N steps, we are also
running all K bit programs for N steps, where K < N.

This way of doing a UD is wasteful in that we keep restarting each program
from the beginning.  I think in most conceptions of the UD we assume that
each program's state is retained, so that when its turn comes up again,
it continues from where it left off.  However I think it would be difficult
to manage the storage space for this to work.

Doing it Chaitin's way might appear to change the frequency with
which a program ones so that it departs from the universal measure.
If we have two programs of length K and L, where K << L but both are
large, it should be that the first program gets running time 2^(L-K)
greater than the second program.  However program K actually gets in
addition L-K runs before we even start running program L, as we build
up to programs of size L.  In the end this should not matter though
as this constant factor will decrease in importance as the size of
programs approaches infinity.  Running programs of size N >> L >> K,
K will get running time 2^(L-K) more than L due to its smaller size,
which corresponds to the universal measure.

This leads to three questions:

 - Would all UD programs would correspond to the universal measure,
asymptotically?

 - Is there a way to retain state for all the programs so that you don't
have to start over from the beginning?  (Although Chaitin's way may
be simpler, and if it gives the same probability distribution then we
couldn't tell the difference).

 - Is it necessary to use self-delimiting programs in order for the
notion of running all programs to be well defined, and recover the
universal distribution?

Hal

Reply via email to