I figured out a way to make the Solomonoff Induction iteratively infinite,
so I guess I was wrong.  Thanks for explaining it to me.  However, I don't
accept that it is feasible to make those calculations, since they would
require an examination of the infinitely many programs that could output
each individual string.

My sense is that the statistics of an examination of a finite number of
programs that output a finite number of strings could be used in Solomonoff
Induction to give a reliable probability of what the next bit (or next
sequence of bits) might be based on the sampling, under the condition that
only those cases that had previously occurred would occur again, and at the
same frequency, during the samplings.  However, the attempt to figure the
probabilities of concatenations of these strings or substrings would be
unreliable and would void whatever benefit the theoretical model might
appear to offer.
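To make the sampling idea concrete, here is a minimal sketch in Python of
predicting the next bit by enumerating only a finite set of programs and
weighting each program M by 2^-|M|, as in the Solomonoff prior.  The
"machine" used here (a program is a bit string whose output is that string
repeated forever) is my own illustrative stand-in, not part of Solomonoff's
construction, and the length cutoff is exactly the finite truncation
discussed above:

```python
from itertools import product

def toy_machine(program, length):
    # Toy "universal" machine (an assumption for illustration only):
    # the program is a bit string whose output is that string repeated
    # forever; we take the first `length` bits of that output.
    return (program * (length // len(program) + 1))[:length]

def predict_next_bit(observed, max_len=10):
    # Weight each program M by 2^-|M| and accumulate the weight of
    # every program (up to length max_len) whose output extends
    # `observed` with a 0 or with a 1, then normalize.
    weights = {"0": 0.0, "1": 0.0}
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            program = "".join(bits)
            out = toy_machine(program, len(observed) + 1)
            if out.startswith(observed):
                weights[out[-1]] += 2.0 ** -n
    total = weights["0"] + weights["1"]
    return {b: w / total for b, w in weights.items()}
```

After observing "0101", the short program "01" dominates the sum, so the
sketch assigns most of the probability to a next bit of 0; the truncation at
max_len is where the reliability question raised above enters.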

Logic, probability and compression methods are all useful in AGI even though
we are constantly violating the laws of logic and probability because it is
necessary, and we sometimes need to use more complicated models
(anti-compression, so to speak) so that we can consider other possibilities
based on what we have previously learned.  So I still don't see how
Kolmogorov complexity and Solomonoff Induction are truly useful except as
theoretical methods that are interesting to consider.

And Occam's Razor is not reliable as an axiom of science.  If we were to
abide by it, we would come to conclusions like a finding that describes an
event by saying that "it occurs some of the time," since that would be
simpler than trying to describe the greater circumstances of the event in an
effort to see if we can find out why the event occurred or didn't occur.  In
this sense Occam's Razor is anti-science, since it implies that the status
quo should be maintained because simpler is better.  All things being equal,
simpler is better; I think we all get that.  However, the human mind is
capable of reweighting the conditions and circumstances of a system to
reconsider other possibilities, and that seems to be an important and
necessary method in research (and in planning).

Jim Bromer

On Sat, Jul 3, 2010 at 11:39 AM, Matt Mahoney <matmaho...@yahoo.com> wrote:

>   Jim Bromer wrote:
> > You can't assume a priori that the diagonal argument is not relevant.
>
> When I say "infinite" in my proof of Solomonoff induction, I mean countably
> infinite, as in aleph-null, as in there is a 1 to 1 mapping between the set
> and N, the set of natural numbers. There are a countably infinite number of
> finite strings, or of finite programs, or of finite length descriptions of
> any particular string. For any finite length string or program or
> description x with nonzero probability, there are a countably infinite
> number of finite length strings or programs or descriptions that are longer
> and less likely than x, and a finite number of finite length strings or
> programs or descriptions that are shorter than x, more likely than x, or
> both.
>
> Aleph-null is larger than any finite integer. This means that for any
> finite set and any countably infinite set, there is not a 1 to 1 mapping
> between the elements, and if you do map all of the elements of the finite
> set to elements of the infinite set, then there are unmapped elements of the
> infinite set left over.
>
> Cantor's diagonalization argument proves that there are infinities larger
> than aleph-null, such as the cardinality of the set of real numbers, which
> we call uncountably infinite. But since I am not using any uncountably
> infinite sets, I don't understand your objection.
>
>
> -- Matt Mahoney, matmaho...@yahoo.com
>
>
>  ------------------------------
> *From:* Jim Bromer <jimbro...@gmail.com>
> *To:* agi <agi@v2.listbox.com>
> *Sent:* Sat, July 3, 2010 9:43:15 AM
>
> *Subject:* Re: [agi] Re: Huge Progress on the Core of AGI
>
> On Fri, Jul 2, 2010 at 6:08 PM, Matt Mahoney <matmaho...@yahoo.com> wrote:
>
>>   Jim, to address all of your points,
>>
>> Solomonoff induction claims that the probability of a string is
>> proportional to the number of programs that output the string, where each
>> program M is weighted by 2^-|M|. The probability is dominated by the
>> shortest program (Kolmogorov complexity), but it is not exactly the same.
>> The difference is small enough that we may neglect it, just as we neglect
>> differences that depend on choice of language.
>>
>
>
> The infinite number of programs that could output the infinite number of
> strings that are to be considered (for example while using Solomonoff
> induction to "predict" what string is being output) lays out the potential
> for the diagonal argument.  You can't assume a priori that the diagonal
> argument is not relevant.  I don't believe that you can prove that it isn't
> relevant since as you say, Kolmogorov Complexity is not computable, and you
> cannot be sure that you have listed all the programs that were able to
> output a particular string. This creates a situation in which the underlying
> logic of using Solomonoff induction is based on incomputable reasoning, which
> can be shown using the diagonal argument.
>
> This kind of criticism cannot be answered with the kinds of presumptions
> that you used to derive the conclusions that you did.  It has to be answered
> directly.  I can think of other infinity-to-infinity relations in which the
> potential mappings can be countably derived from the formulas or equations,
> but I have yet to see any analysis which explains why this usage is valid.
> Although you may imagine that the summation of the probabilities can be used
> just as if it were an ordinary number, the unchecked usage is faulty.  In
> other words, the criticism has to be considered more carefully by someone
> capable of dealing with complex mathematical problems that involve the
> legitimacy of claims about infinite-to-infinite mappings.
>
> Jim Bromer
>
>
>
> On Fri, Jul 2, 2010 at 6:08 PM, Matt Mahoney <matmaho...@yahoo.com> wrote:
>
>>   Jim, to address all of your points,
>>
>> Solomonoff induction claims that the probability of a string is
>> proportional to the number of programs that output the string, where each
>> program M is weighted by 2^-|M|. The probability is dominated by the
>> shortest program (Kolmogorov complexity), but it is not exactly the same.
>> The difference is small enough that we may neglect it, just as we neglect
>> differences that depend on choice of language.
>>
>> Here is the proof that Kolmogorov complexity is not computable. Suppose it
>> were. Then I could test the Kolmogorov complexity of strings in increasing
>> order of length (breaking ties lexicographically) and describe "the first
>> string that cannot be described in less than a million bits", contradicting
>> the fact that I just did. (Formally, I could write a program that outputs
>> the first string whose Kolmogorov complexity is at least n bits, choosing n
>> to be larger than my program).
>>
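Matt's Berry-paradox argument can be sketched as a program.  Because real
Kolmogorov complexity is not computable, the search below takes the
complexity measure K as a parameter; the point of the proof is that if a
computable K existed, this short program would itself be a description,
shorter than n bits, of a string whose K-value is at least n.  The stand-in
measure used in the demonstration is my own assumption:

```python
from itertools import product

def first_string_with_complexity_at_least(n, K):
    # Enumerate bit strings in order of increasing length, breaking
    # ties lexicographically, and return the first whose complexity
    # under the supplied measure K is at least n.
    length = 0
    while True:
        for bits in product("01", repeat=length):
            s = "".join(bits)
            if K(s) >= n:
                return s
        length += 1

# Demonstration with a computable stand-in measure (string length).
# With the true, uncomputable Kolmogorov complexity this search could
# never be carried out -- which is exactly what the proof shows.
```

With K = len the search terminates immediately on the first string of
length n; with the real K it would be the contradictory "first string not
describable in fewer than n bits."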
>> Here is the argument that Occam's Razor and Solomonoff distribution must
>> be true. Consider all possible probability distributions p(x) over any
>> infinite set X of possible finite strings x, i.e. any X = {x: p(x) > 0} that
>> is infinite. All such distributions must favor shorter strings over longer
>> ones. Consider any x in X. Then p(x) > 0. There can be at most a finite
>> number (less than 1/p(x)) of strings that are more likely than x, and
>> therefore an infinite number of strings which are less likely than x. Of
>> this infinite set, only a finite number (less than 2^|x|) can be shorter
>> than x, and therefore there must be an infinite number that are longer than
>> x. So for each x we can partition X into 4 subsets as follows:
>>
>> - shorter and more likely than x: finite
>> - shorter and less likely than x: finite
>> - longer and more likely than x: finite
>> - longer and less likely than x: infinite.
>>
>> So in this sense, any distribution over the set of strings must favor
>> shorter strings over longer ones.
>>
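The four-way partition above can be checked numerically for a concrete
prior.  The distribution below, p(x) = 2^(-2|x|-1), is my own choice of
example (it sums to 1 over all finite bit strings); counting strings up to
a cutoff length shows three of the cells stay fixed and finite while the
fourth grows without bound as the cutoff increases:

```python
from itertools import product

def p(x):
    # Example prior over finite bit strings: p(x) = 2^(-2|x|-1).
    # It is normalized: sum over the 2^n strings of each length n
    # gives sum_n 2^n * 2^(-2n-1) = sum_n (1/2)^(n+1) = 1.
    return 2.0 ** (-2 * len(x) - 1)

def partition_counts(x, max_len):
    # Classify every bit string of length <= max_len into the four
    # cells of the partition (strict comparisons; equally likely
    # strings of the same length fall into no cell).
    cells = {"shorter_more": 0, "shorter_less": 0,
             "longer_more": 0, "longer_less": 0}
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if len(s) < len(x) and p(s) > p(x):
                cells["shorter_more"] += 1
            elif len(s) < len(x) and p(s) < p(x):
                cells["shorter_less"] += 1
            elif len(s) > len(x) and p(s) > p(x):
                cells["longer_more"] += 1
            elif len(s) > len(x) and p(s) < p(x):
                cells["longer_less"] += 1
    return cells
```

For x = "01", the "shorter and more likely" cell holds exactly 3 strings
("", "0", "1") no matter how far the cutoff is pushed, the other two finite
cells are empty, and only "longer and less likely" keeps growing.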
>>
>> -- Matt Mahoney, matmaho...@yahoo.com
>>
>>
>>  ------------------------------
>> *From:* Jim Bromer <jimbro...@gmail.com>
>> *To:* agi <agi@v2.listbox.com>
>> *Sent:* Fri, July 2, 2010 4:09:38 PM
>>
>> *Subject:* Re: [agi] Re: Huge Progress on the Core of AGI
>>
>>
>>
>> On Fri, Jul 2, 2010 at 2:25 PM, Jim Bromer <jimbro...@gmail.com> wrote:
>>>
>>>    There cannot be a one-to-one correspondence between the representation of
>>>> the shortest program that produces a string and the strings that are produced.
>>>> This means that if the consideration of the hypotheses were to be put into
>>>> general mathematical form it must include the potential of many to one
>>>> relations between candidate programs (or subprograms) and output strings.
>>>>
>>>
>>
>>> But, there is also no way to determine what the "shortest" program is,
>>> since there may be different programs that are the same length.  That means
>>> that there is a many to one relation between programs and program "length".
>>> So the claim that you could just iterate through programs *by length* is
>>> false.  This is the goal of algorithmic information theory, not a premise
>>> of a methodology that can be used.  So you have the diagonalization problem.
>>>
>>
>>
>> A counter argument is that there are only a finite number of Turing
>> Machine programs of a given length.  However, since you guys have
>> specifically designated that this theorem applies to any construction of a
>> Turing Machine it is not clear that this counter argument can be used.  And
>> there is still the specific problem that you might want to try a program
>> that writes a longer program to output a string (or many strings).  Or you
>> might want to write a program that can be called to write longer programs on
>> a dynamic basis.  I think these cases, where you might consider a program
>> that outputs a longer program (or another instruction string for
>> another Turing Machine), constitute a serious problem that, at the
>> least, deserves to be answered with sound analysis.
>>
>> Part of my original intuitive argument, that I formed some years ago, was
>> that without a heavy constraint on the instructions for the program, it will
>> be practically impossible to test or declare that some program is indeed the
>> shortest program.  However, I can't quite get to the point now that I can
>> say that there is definitely a diagonalization problem.
>>
>> Jim Bromer
>>



-------------------------------------------
agi
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/
Modify Your Subscription: 
https://www.listbox.com/member/?member_id=8660244&id_secret=8660244-6e7fb59c
Powered by Listbox: http://www.listbox.com
