On Fri, Feb 18, 2011 at 03:46:45PM -0800, Travis Garrett wrote:
Sorry for the slow reply, I have been working on various things and
also catching up on the many conversations (and naming conventions)
on this board. And thanks for your interest! -- I think I have
discovered a giant "low hanging fruit", which had previously gone
unnoticed since it is rather nonintuitive in nature (in addition to
being a subject that many smart people shy away from thinking about).
Ok, let me address the Faddeev-Popov, "gauge-invariant information"
issue first. I'll start with the final conclusion reduced to its
basic essence, and give more concrete examples later. First, note
that any one "structure" can have many different "descriptions". When
counting among different structures, it is thus crucial to choose only
one description per structure, as including redundant descriptions
will spoil the calculation. In other words, one only counts over the
gauge-invariant information structures.
This is essentially what one does in the derivation of the
Solomonoff-Levin distribution, aka "Universal Prior". That is, fix a
universal prefix Turing machine (one whose set of halting programs is
prefix-free). Then all
input programs generating the same output are considered
equivalent. The universal prior for a given output is given by summing
over the equivalence class of inputs giving that output, weighted
exponentially by the length of the unique prefix.
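The weighting can be made concrete with a toy prefix machine (my own
illustrative construction, not a full universal machine): valid programs
have the form 0^k 1, which is a prefix-free set, and the prior for an
output is the sum of 2^-length over the equivalence class of programs
producing it.

```python
from fractions import Fraction

def toy_machine(program: str):
    """A toy prefix machine: reads '0'*k + '1' and outputs k mod 3.
    Prefix property: once the terminating '1' is read, the output is
    fixed; reading further bits cannot change it."""
    if not program.endswith("1") or "1" in program[:-1]:
        return None  # not a valid (halting) program
    k = len(program) - 1
    return k % 3

def universal_prior(max_len=60):
    """Approximate the prior m(x) by summing 2**-len(p) over every
    program p (up to max_len bits) whose output is x."""
    m = {}
    for length in range(1, max_len + 1):
        prog = "0" * (length - 1) + "1"  # the only halting program of this length
        out = toy_machine(prog)
        m[out] = m.get(out, Fraction(0)) + Fraction(1, 2 ** length)
    return m

m = universal_prior()
# the exact limits are 4/7, 2/7, 1/7 for outputs 0, 1, 2
```

Each output is produced by infinitely many programs; summing over the
whole equivalence class (rather than one description per program) is
exactly the "one description per structure" bookkeeping above.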
This result (which dates from the early 70s) gives rise to the various
Occam's razor theorems that have been published since. My own modest
contribution was to note that any classifier function taking bit
strings as input and mapping them to a discrete set (whether integers,
or meanings, matters not) in a prefix way (the meaning of the string,
once decided, does not change on reading more bits) will work. Turing
machines are not strictly needed, and one expects observers to behave
this way, so an Occam's razor theorem will apply to each and every
observer, even if the observers do not agree on the relative
complexities of their worlds.
However, this only suffices to eliminate what Bruno would call "3rd
person white rabbits". There are still 1st person white rabbits that
arise through the failure of induction problem. I will explain my
solution to that issue further down.
A very important lemma to this is that all of the random noise is
also removed when the redundant descriptions are cut, as the random
noise doesn't encode any invariant structure. Thus, for instance, I
agree with COMP, but I disagree that white rabbits are therefore a
problem... The vast majority of the output of a universal dovetailer
(which I call A in my paper) is random noise which doesn't actually
describe anything (despite "optical illusions" to the contrary...) and
can therefore be zapped, leaving the union of nontrivial, invariant
structures in U (which I then argue is dominated by the observer
O due to combinatorics).
It is important to remember that random noise events are not white
rabbits. A nice physicsy example of the distinction is to consider a
room full of air. The random motion of the molecules is not a white
rabbit, that is just normal thermal noise. All of the molecules being
situated in one small corner of the room, however, so that an observer
sitting in it ends up suffocating, is a white rabbit. One could say
that white rabbits are extremely low entropy states that happen by
chance, which is the key to understanding why they're never
observed. To be low entropy, the state must have significance to the
observer, as well as being of low probability. Otherwise, any
arbitrary configuration will have low entropy.
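The suppression of the suffocation state can be put in rough numbers (a
minimal sketch; the molecule count and room fraction are illustrative,
and the molecules are treated as independent):

```python
import math

def corner_log10_prob(n_molecules: float, fraction: float) -> float:
    """log10 of the probability that n independent molecules are all
    found in the given fraction of the room's volume at one instant:
    p = fraction**n, so log10 p = n * log10(fraction)."""
    return n_molecules * math.log10(fraction)

# roughly a mole of air crowding into half the room:
# log10 p ~ 6e23 * log10(0.5) ~ -1.8e23, a state never observed by chance
```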
When observing data, it is important that observers are relatively
insensitive to error. It does not help to fail to recognise a lion in
the African savannah just because it is partially obscured by a
tree. Computers used to be terrible at just this sort of problem - you
needed the exact key to extract a record from a database - now various
sorts of fuzzy techniques, particularly ones inspired by the neural
structure of the brain, mean computers are much better at dealing
with noisy data. With this observation, it becomes clear that the
myriad of nearby histories that differ only in a few bits are not
recognised as different from the original observation. These are not
white rabbits. It requires many bits to make a white rabbit, and this,
as you eloquently point out, is doubly exponentially suppressed.
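This lumping together of nearby histories can be counted directly (a
sketch; the function names are mine): histories within a small Hamming
distance of an observation are not distinguished from it, while a
genuine white rabbit requires many flipped bits.

```python
from math import comb

def hamming_ball(n: int, k: int) -> int:
    """Number of n-bit strings within Hamming distance k of a given one:
    all the 'nearby histories' a noise-tolerant observer treats as the
    same observation."""
    return sum(comb(n, d) for d in range(k + 1))

def same_observation(a: str, b: str, tolerance: int) -> bool:
    """A noise-tolerant match: histories differing in at most
    `tolerance` bits are not recognised as different."""
    return sum(x != y for x, y in zip(a, b)) <= tolerance
```

For 10-bit histories and a 1-bit tolerance, each observation absorbs 11
bit-for-bit distinct histories; flipping many bits at once lands
outside every such ball, which is the many-bit event that gets the
double-exponential suppression.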
Bruno will probably still comment that this does not dispose of all
the 1st person white rabbits, but I fail to see what other ones there
could be.