On Sun, Dec 29, 2013 at 11:54 PM, meekerdb <[email protected]> wrote:

>  On 12/29/2013 7:45 PM, Jason Resch wrote:
>
>
>
>
> On Sun, Dec 29, 2013 at 6:58 PM, meekerdb <[email protected]> wrote:
>
>>   On 12/29/2013 3:49 PM, Jason Resch wrote:
>>
>>
>>
>>
>> On Sun, Dec 29, 2013 at 5:42 PM, meekerdb <[email protected]> wrote:
>>
>>>   On 12/29/2013 2:08 PM, Jason Resch wrote:
>>>
>>>
>>>
>>>
>>> On Sun, Dec 29, 2013 at 4:51 PM, meekerdb <[email protected]> wrote:
>>>
>>>>  On 12/29/2013 1:28 PM, Jason Resch wrote:
>>>>
>>>>
>>>>
>>>>
>>>> On Sun, Dec 29, 2013 at 2:25 PM, meekerdb <[email protected]> wrote:
>>>>
>>>>>  On 12/29/2013 5:56 AM, Bruno Marchal wrote:
>>>>>
>>>>>
>>>>>  On 28 Dec 2013, at 22:23, meekerdb wrote:
>>>>>
>>>>>  On 12/28/2013 4:09 AM, Bruno Marchal wrote:
>>>>>
>>>>> For a long time I have had opponents saying that we cannot
>>>>> computationally generate a random number, and that is right if we want to
>>>>> generate only such numbers. But a simple counting algorithm generating all
>>>>> numbers, 0, 1, 2, ..., 6999500235148668, ..., generates all the random,
>>>>> finite, incompressible strings,
>>>>>
>>>>>
>>>>> How can a finite string be incompressible?  6999500235148668 in base
>>>>> 6999500235148668 is just 10.
>>>>>
>>>>>
>>>>>
>>>>>  You can define a finite string as incompressible when the shortest
>>>>> combinator that generates it is as long as the string itself.
>>>>> This definition fails to be universal only for a finite number of short
>>>>> sequences, which indeed depend on the language used (here,
>>>>> combinators).
>>>>>
>>>>>  Then you can show that such a definition can be made universal by
>>>>> adding some constant, which will depend on the universal language.
>>>>>
>>>>>  It can be shown that most (finite!) numbers, written in any base,
>>>>> are random in that sense.
>>>>>
>>>>>  Of course, 10 is a sort of compression of any string X in some base,
>>>>> but if you allow a change of base, you will need to send the base along
>>>>> with the number in the message. If you fix the base, then indeed 10 will
>>>>> be a compression of that particular number in that base, for that
>>>>> language, and it is part of incompressibility theory that no definition
>>>>> exists that works for all (small) numbers.
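>>>>>
>>>>>  To state that invariance a bit more precisely: for any two universal
>>>>> languages U and V there is a constant c_{U,V}, depending only on U and V
>>>>> and not on the string, such that
>>>>>
>>>>>      K_U(x) <= K_V(x) + c_{U,V}   for every finite string x,
>>>>>
>>>>> so changing the language shifts the complexity of any string by at most
>>>>> a bounded amount.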
>>>>>
>>>>>
>>>>>  Since all finite numbers are small, I think this means the theory
>>>>> only holds in the limit.
>>>>>
>>>>> Brent
>>>>>
>>>>
>>>>
>>>>  Brent,
>>>>
>>>>  It is easy to see with the pigeonhole principle.  There are more 2-digit
>>>> numbers than 1-digit numbers, more 3-digit numbers than 2-digit numbers,
>>>> and so on.  For any string you represent with a shorter string, some other
>>>> "shorter string" must necessarily be displaced.  You can't keep replacing
>>>> things with shorter strings because there aren't enough of them, so as a
>>>> side effect, every compression strategy must represent some strings with
>>>> longer ones.  In fact, the average size of all possible compressed messages
>>>> (up to some upper-bound length n) can never be smaller than the average
>>>> size of all uncompressed messages.
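>>>>
>>>>  A quick way to see the counting (a throwaway Python sketch, nothing
>>>> more):
>>>>
>>>> # Compare binary strings of length exactly n with all strictly shorter ones.
>>>> n = 16
>>>> exact = 2 ** n                           # strings of length n: 65536
>>>> shorter = sum(2 ** k for k in range(n))  # strings of length 0..n-1: 65535
>>>> print(exact, shorter)
>>>> # There is always one more string of length n than there are shorter
>>>> # strings, so no lossless scheme can shorten every length-n string.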
>>>>
>>>>  The only reason compression algorithms are useful is that they are
>>>> tailored to represent some class of messages with shorter strings, while
>>>> making (the vast majority of) other messages slightly larger.
>>>>
>>>>
>>>>  A good explanation.
>>>>
>>>
>>>  Thanks.
>>>
>>>
>>>>  But just because you cannot compress all numbers of a given size
>>>> doesn't imply that any particular number is incompressible.
>>>>
>>>
>>>  That is true if you consider the size of the compression program to be
>>> of no relevance.  In such a case, you can of course have a number of very
>>> small strings map directly to very large ones.
>>>
>>>
>>>>   So isn't it the case that every finite number string is compressible
>>>> in some algorithm?  So there's no sense to saying 6999500235148668 is
>>>> random, but 11111111111111 is not, except relative to some given
>>>> compression algorithm.
>>>>
>>>
>>>  Right, but this leads to the concept of Kolmogorov complexity. If you
>>> consider the combined size of the minimum string and the algorithm needed
>>> to reproduce some number, you will find that some patterns of data are
>>> more compressible than others.  In your previous example, you would need
>>> to include both the base 6999500235148668 and the string "10" in order to
>>> encode 6999500235148668.
>>>
>>>
>>>  But that seems to make the randomness of a number dependent on the
>>> base used to write it down? Did I have to write down "And this is in base
>>> 10" to show that 6999500235148668 is random?  There seems to be an
>>> equivocation here on "computing a number" and "computing a representation
>>> of a number".
>>>
>>>
>>>
>>  A number containing regular patterns in some base will also contain
>> regular patterns in other bases (even if they are not obvious to us), and
>> compression algorithms are good at recognizing them.
>>
>>  The text of this sentence may not seem very redundant, but English text
>> can generally be compressed to somewhere between 20% and 30% of its original
>> size.  If you convert a number like "55555555555" to base 2, its patterns
>> should become more evident in the pattern of bits.
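>>
>>  As a rough illustration (a Python sketch with zlib; exact byte counts will
>> vary), a general-purpose compressor shrinks regular data dramatically and
>> leaves structureless data essentially unchanged:
>>
>> import os, zlib
>>
>> patterned = b"5" * 10000     # highly regular data
>> noise = os.urandom(10000)    # data with no structure to exploit
>>
>> print(len(zlib.compress(patterned)))  # a few dozen bytes
>> print(len(zlib.compress(noise)))      # about 10000 bytes, often slightly more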
>>
>>
>>>     For the majority of numbers, you will find the Kolmogorov
>>> complexity of the number to almost always be on the order of the number of
>>> digits in that number.  The exceptions like 1111111111 are few and far
>>> between.
>>>
>>>
>>>  111111111 looks a lot messier in base 9.
>>>
>>>
>>
>>  base 10: 1111111111111111111
>>  base 9: 7355531854711617707
>> base 2: 11110110101101110101101010110010101111000100011100
>>
>>  In base 9, there is a high proportion of 7's compared to other digits.
>> In base 2, the sequence '110' seems more common than chance would suggest.
>> In any case, the number is far from incompressible: in my encoding language
>> it takes only 9 characters to represent that 19-digit number,
>> "1r19inb10" = "1 repeated 19 times in base 10".
>>
>>
>>  So you are agreeing with me that to cite a specific number and say
>> "That number is random." is meaningless.
>>
>
>  I agree with that in the sense of "random as unpredictable", but I
> disagree in the sense of "random as incompressible". Some numbers are
> objectively not compressible, just as some shufflings of a deck of cards
> are incompressible, because the shortest possible description of the
> ordering of the cards is no shorter than simply giving the list of cards
> itself.  So it is with a number and its digits.
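>
>  To put a number on the card example (a small sketch; the point is just
> that naming one ordering out of all 52! of them takes a fixed minimum
> number of bits):
>
> import math
>
> orderings = math.factorial(52)          # 52! possible shufflings
> bits = math.ceil(math.log2(orderings))  # minimum bits to single one out
> print(bits)                             # 226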
>
>
> I think we're talking past each other.  What you're calling a number is
> what I'd call a string of digits.  I can understand a string of digits
> being incompressible...but the number it names has many representations.
>

I agree we are walking a fine line between mathematics and information
theory, but they are intimately related.  Every string of digits
corresponds to a number. I suppose your point might be that every number
has many encodings.  However, in the context of information theory, there
is an absolute minimum amount of information *required* to communicate a
given number to another party, once you include the information necessary
to define or describe whatever performs the decoding of that information
to produce the given number.

If this were not so, there would never be any need for long downloads.  We
could encode any arbitrary file as some number, work out an efficient
encoding of that number into some very short piece of information, transmit
that short piece, and then decode it back into the number (which is the
file we wanted to download).  The amount of information required to
represent a message is directly tied to the number of possible states that
message could have been in.
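
The counting view can be made concrete with a small sketch (illustrative
only; it assumes every possible message is equally likely):

import math

# Singling out one of num_states equally likely possibilities takes at
# least ceil(log2(num_states)) bits.
def min_bits(num_states):
    return math.ceil(math.log2(num_states))

print(min_bits(2))       # 1 bit: a coin flip
print(min_bits(256))     # 8 bits: one arbitrary byte
print(min_bits(10**16))  # 54 bits: a 16-digit number with no known structure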

Consider a movie at 1024x768 pixels, 30 frames per second, 3 bytes per
pixel (R,G,B), that is 10 minutes long.  The total number of possible
movies of this format is 256^(1024 x 768 x 30 x 3 x 10 x 60), or
256^(42,467,328,000), a number so large that it takes 42 billion bytes just
to write it down.  As it happens, the amount of information necessary to
encode such a movie is also 42 billion bytes (42 GB), and this is precisely
because of the number of possibilities: each possibility requires a unique
encoding, since the movies are one-to-one with the encodings, and there are
256^(42,467,328,000) of them.
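
For the record, the arithmetic works out (a quick sketch; 42 GB here means
the decimal gigabyte, 10^9 bytes, with no compression of the pixel data
assumed):

bytes_per_frame = 1024 * 768 * 3              # 3 bytes per pixel
total_bytes = bytes_per_frame * 30 * 10 * 60  # 30 fps for 10 minutes
print(total_bytes)                            # 42467328000

# The number of distinct movies is 256 ** total_bytes, and picking one of
# them out takes log base 256 of that count -- i.e. total_bytes bytes.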



> To say a number is incompressible??  There's an old joke proof in
> mathematics that every number is interesting, otherwise there would be a
> smallest uninteresting number - which would perforce be interesting.  It
> seems that interesting means something like "has a short description".
>

In the space of all possible movies, the ones that are watchable or
meaningful to human viewers would all be highly compressible. The ones that
are random snow, despite containing more information, would not make
interesting movies.  So maybe there is something to your idea that
interesting is related to short descriptions. We did evolve to find
entirely predictable and entirely unpredictable things boring, there may be
some ideal blend of predictability and unpredicability that we find most
engaging.

Jason


>
> Brent
>
