On Fri, Jul 2, 2010 at 6:08 PM, Matt Mahoney <[email protected]> wrote:
> Jim, to address all of your points, > > Solomonoff induction claims that the probability of a string is > proportional to the number of programs that output the string, where each > program M is weighted by 2^-|M|. The probability is dominated by the > shortest program (Kolmogorov complexity), but it is not exactly the same. > The difference is small enough that we may neglect it, just as we neglect > differences that depend on choice of language. > The infinite number of programs that could output the infinite number of strings that are to be considered (for example while using Solomonoff induction to "predict" what string is being output) lays out the potential for the diagonal argument. You can't assume a priori that the diagonal argument is not relevant. I don't believe that you can prove that it isn't relevant since as you say, Kolmogorov Complexity is not computable, and you cannot be sure that you have listed all the programs that were able to output a particular string. This creates a situation in which the underlying logic of using Solmonoff induction is based on incomputable reasoning which can be shown using the diagonal argument. This kind of criticism cannot be answered with the kinds of presumptions that you used to derive the conclusions that you did. It has to be answered directly. I can think of other infinity to infinity relations in which the potential mappings can be countably derived from the formulas or equations, but I have yet to see any analysis which explains why this usage can be. Although you may imagine that the summation of the probabilities can be used just like it was an ordinary number, the unchecked usage is faulty. In other words the criticism has to be considered more carefully by someone capable of dealing with complex mathematical problems that involve the legitimacy of claims between infinite to infinite mappings. Jim Bromer On Fri, Jul 2, 2010 at 6:08 PM, Matt Mahoney <[email protected]> wrote: > Jim, to address all of your points, > > Solomonoff induction claims that the probability of a string is > proportional to the number of programs that output the string, where each > program M is weighted by 2^-|M|. The probability is dominated by the > shortest program (Kolmogorov complexity), but it is not exactly the same. > The difference is small enough that we may neglect it, just as we neglect > differences that depend on choice of language. > > Here is the proof that Kolmogorov complexity is not computable. Suppose it > were. Then I could test the Kolmogorov complexity of strings in increasing > order of length (breaking ties lexicographically) and describe "the first > string that cannot be described in less than a million bits", contradicting > the fact that I just did. (Formally, I could write a program that outputs > the first string whose Kolmogorov complexity is at least n bits, choosing n > to be larger than my program). > > Here is the argument that Occam's Razor and Solomonoff distribution must be > true. Consider all possible probability distributions p(x) over any infinite > set X of possible finite strings x, i.e. any X = {x: p(x) > 0} that is > infinite. All such distributions must favor shorter strings over longer > ones. Consider any x in X. Then p(x) > 0. There can be at most a finite > number (less than 1/p(x)) of strings that are more likely than x, and > therefore an infinite number of strings which are less likely than x. Of > this infinite set, only a finite number (less than 2^|x|) can be shorter > than x, and therefore there must be an infinite number that are longer than > x. So for each x we can partition X into 4 subsets as follows: > > - shorter and more likely than x: finite > - shorter and less likely than x: finite > - longer and more likely than x: finite > - longer and less likely than x: infinite. > > So in this sense, any distribution over the set of strings must favor > shorter strings over longer ones. > > > -- Matt Mahoney, [email protected] > > > ------------------------------ > *From:* Jim Bromer <[email protected]> > *To:* agi <[email protected]> > *Sent:* Fri, July 2, 2010 4:09:38 PM > > *Subject:* Re: [agi] Re: Huge Progress on the Core of AGI > > > > On Fri, Jul 2, 2010 at 2:25 PM, Jim Bromer <[email protected]> wrote: >> >> There cannot be a one to one correspondence to the representation of >>> the shortest program to produce a string and the strings that they produce. >>> This means that if the consideration of the hypotheses were to be put into >>> general mathematical form it must include the potential of many to one >>> relations between candidate programs (or subprograms) and output strings. >>> >> > >> But, there is also no way to determine what the "shortest" program is, >> since there may be different programs that are the same length. That means >> that there is a many to one relation between programs and program "length". >> So the claim that you could just iterate through programs *by length* is >> false. This is the goal of algorithmic information theory not a premise >> of a methodology that can be used. So you have the diagonalization problem. >> > > > A counter argument is that there are only a finite number of Turing Machine > programs of a given length. However, since you guys have specifically > designated that this theorem applies to any construction of a Turing Machine > it is not clear that this counter argument can be used. And there is still > the specific problem that you might want to try a program that writes a > longer program to output a string (or many strings). Or you might want to > write a program that can be called to write longer programs on a dynamic > basis. I think these cases, where you might consider a program that outputs > a longer program, (or another instruction string for another Turing > Machine) constitutes a serious problem, that at the least, deserves to be > answered with sound analysis. > > Part of my original intuitive argument, that I formed some years ago, was > that without a heavy constraint on the instructions for the program, it will > be practically impossible to test or declare that some program is indeed the > shortest program. However, I can't quite get to the point now that I can > say that there is definitely a diagonalization problem. > > Jim Bromer > > *agi* | Archives <https://www.listbox.com/member/archive/303/=now> > <https://www.listbox.com/member/archive/rss/303/> | > Modify<https://www.listbox.com/member/?&>Your Subscription > <http://www.listbox.com/> > *agi* | Archives <https://www.listbox.com/member/archive/303/=now> > <https://www.listbox.com/member/archive/rss/303/> | > Modify<https://www.listbox.com/member/?&>Your Subscription > <http://www.listbox.com/> > ------------------------------------------- agi Archives: https://www.listbox.com/member/archive/303/=now RSS Feed: https://www.listbox.com/member/archive/rss/303/ Modify Your Subscription: https://www.listbox.com/member/?member_id=8660244&id_secret=8660244-6e7fb59c Powered by Listbox: http://www.listbox.com
