On 12/29/2013 3:49 PM, Jason Resch wrote:
On Sun, Dec 29, 2013 at 5:42 PM, meekerdb <[email protected]> wrote:
On 12/29/2013 2:08 PM, Jason Resch wrote:
On Sun, Dec 29, 2013 at 4:51 PM, meekerdb <[email protected]> wrote:
On 12/29/2013 1:28 PM, Jason Resch wrote:
On Sun, Dec 29, 2013 at 2:25 PM, meekerdb <[email protected]> wrote:
On 12/29/2013 5:56 AM, Bruno Marchal wrote:
On 28 Dec 2013, at 22:23, meekerdb wrote:
On 12/28/2013 4:09 AM, Bruno Marchal wrote:
For a long time I had opponents saying that we cannot computationally
generate a random number, and that is right if we want to generate only
those numbers. But a simple counting algorithm generating all numbers,
0, 1, 2, ..., 6999500235148668, ..., generates all random finite
incompressible strings,
How can a finite string be incompressible? 6999500235148668 in base
6999500235148669 is just 10.
You can define a finite string as incompressible when the shortest
combinator that generates it is as lengthy as the string itself. This
definition is not universal for a finite number of short sequences,
which will indeed depend on the language used (here, combinators). Then
you can show that such a definition can be made universal by adding some
constant, which will depend on the universal language. It can be shown
that most (finite!) numbers, written in any base, are random in that
sense.
Of course, 10 is a sort of compression of any string X in some base, but
if you allow a change of base, you will need to send the base along with
the number in the message. If you fix the base, then indeed 10 will be a
compression of that particular number in that base, for that language,
and it is part of incompressibility theory that no definition exists
that works for all (small) numbers.
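The "most numbers are random" claim follows from a simple counting
bound. Here is a minimal Python sketch of it (my own illustration, not
from Bruno's post; the choice of n = 30 is arbitrary):

    n = 30                      # bit-length of the strings we count (arbitrary)
    total = 2 ** n              # number of binary strings of length n

    for c in range(1, 11):
        # There are only 2**(n-c) - 1 binary strings shorter than n - c bits,
        # so at most that many length-n strings can have a description that short.
        short_descriptions = 2 ** (n - c) - 1
        print(f"strings describable in fewer than {n - c} bits: "
              f"at most {short_descriptions / total:.4%} of all length-{n} strings")

So for any fixed saving of c bits, at most about one string in 2**c can
achieve it, whichever description language is used (up to the additive
constant Bruno mentions).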
Since all finite numbers are small, I think this means the theory only
holds in the limit.
Brent
Brent,
It is easy to see with the pigeonhole principle. There are more 2-digit
numbers than 1-digit numbers, and more 3-digit numbers than 2-digit
numbers, and so on. For any string you represent using a shorter string,
another "shorter string" must necessarily be displaced. You can't keep
replacing things with shorter strings because there aren't enough of
them, so as a side effect, every compression strategy must represent
some strings by larger ones. In fact, the average size of all possible
compressed messages (with some upper-bound length n) can never be
smaller than the average size of all uncompressed messages.
The only reason compression algorithms are useful is that they are
tailored to represent some class of messages with shorter strings, while
making (the vast majority of) other messages slightly larger.
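Jason's counting argument can be checked by brute force for small sizes.
A minimal Python sketch (my own illustration; all names in it are made
up):

    from itertools import product

    def all_strings(max_len):
        """All binary strings of length 1..max_len, shortest first."""
        for length in range(1, max_len + 1):
            for bits in product("01", repeat=length):
                yield "".join(bits)

    def best_case_average(max_len):
        """Average lengths before and after the best possible injective code."""
        messages = list(all_strings(max_len))
        # Pigeonhole: a lossless (injective) code cannot reuse a codeword, so
        # the best it can possibly do is hand out the shortest strings first.
        codewords = list(all_strings(max_len + 1))[:len(messages)]
        avg_in = sum(map(len, messages)) / len(messages)
        avg_out = sum(map(len, codewords)) / len(codewords)
        return avg_in, avg_out

    for n in range(1, 8):
        avg_in, avg_out = best_case_average(n)
        print(f"max length {n}: original average {avg_in:.3f}, "
              f"best achievable compressed average {avg_out:.3f}")

The two averages come out identical for every n, which is Jason's point
in its best case: even an ideal assignment of codewords cannot lower the
average, and any real scheme that shortens some messages must lengthen
others.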
A good explanation.
Thanks.
But just because you cannot compress all numbers of a given size
doesn't imply
that any particular number is incompressible.
That is true if you consider the size of the compression program to be
of no relevance. In such a case, you can of course have a number of very
small strings map directly to very large ones.
So isn't it the case that every finite number string is compressible
under some algorithm? So there's no sense in saying 6999500235148668 is
random but 11111111111111 is not, except relative to some given
compression algorithm.
Right, but this leads to the concept of Kolmogorov complexity. If you
consider the size of the minimum string and the algorithm together that
are necessary to represent some number, you will find there are some
patterns of data that are more compressible than others. In your
previous example, you would need to include both the base
6999500235148669 and the string "10" in order to encode
6999500235148668.
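A quick arithmetic check of that point (the "base:digits" packaging
below is just a hypothetical encoding I made up for the comparison):

    number = "6999500235148668"        # the string Brent asked about
    base   = "6999500235148669"        # the base in which it reads as "10"

    # Package the base together with the digits "10", since the decoder
    # needs both; a hypothetical "base:digits" format.
    compressed = base + ":" + "10"

    print(len(number))        # 16 characters
    print(len(compressed))    # 19 characters -- longer, not shorter

Once the base has to travel with the message, the "compressed" form is
no shorter than the number itself.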
But that seems to make the randomness of a number dependent on the base
used to write it down? Did I have to write down "And this is in base 10"
to show that 6999500235148668 is random? There seems to be an
equivocation here on "computing a number" and "computing a
representation of a number".
A number containing regular patterns in some base will also contain
regular patterns in some other base (even if they are not obvious to
us), and compression algorithms are good at recognizing them.
The text of this sentence may not seem very redundant, but English text
can generally be compressed to somewhere between 20% and 30% of its
original size. If you convert a number like "55555555555" to base 2, its
regularities should still be evident in the pattern of bits.
For the majority of numbers, you will find that the Kolmogorov
complexity of the number is on the order of the number of digits in that
number. The exceptions, like 1111111111, are few and far between.
1111111111111111111 looks a lot messier in base 9:
base 10: 1111111111111111111
base 9: 7355531854711617707
base 2: 11110110101101110101101010110010101111000100011100
In base 9, there is a high proportion of 7's compared to other digits.
In base 2, the sequence '110' seems more common than chance would
suggest. In any case, the number is far from incompressible: in my
encoding language, it takes only 9 characters, "1r19inb10" = "1 repeated
19 times in base 10", to represent that 19-digit number.
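For anyone who wants to reproduce the listing above, here is a small
base-conversion helper (written for this note, not taken from the
thread):

    def to_base(n, base, digits="0123456789abcdefghijklmnopqrstuvwxyz"):
        """Return n written in the given base (2 <= base <= 36) as a string."""
        if n == 0:
            return "0"
        out = []
        while n:
            n, remainder = divmod(n, base)
            out.append(digits[remainder])
        return "".join(reversed(out))

    n = int("1" * 19)            # 1111111111111111111 in base 10
    print(to_base(n, 9))         # its base-9 form
    print(to_base(n, 2))         # its base-2 form

Comparing its output against the digits quoted above is an easy way to
check them.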
So you are agreeing with me that to cite a specific number and say "That number is
random." is meaningless.
Brent