Jim,

From the article Matt linked to, specifically see the line:
"As p is itself a binary string, we can define the discrete universal a priori probability, m(x), to be the probability that the output of a universal prefix Turing machine U is x when provided with fair coin flips on the input tape."

--Abram

On Fri, Aug 6, 2010 at 3:38 PM, Matt Mahoney <[email protected]> wrote:

Jim, see http://www.scholarpedia.org/article/Algorithmic_probability ; I think this answers your questions.

-- Matt Mahoney, [email protected]

From: Jim Bromer <[email protected]>
To: agi <[email protected]>
Sent: Fri, August 6, 2010 2:18:09 PM
Subject: Re: [agi] Comments On My Skepticism of Solomonoff Induction

I meant: did Solomonoff's original idea use randomization to determine the bits of the programs that are used to produce the *prior probabilities*? I think the answer to that is obviously no. The randomization of the next bit would be used in testing the prior probabilities, as done with random sampling. He probably found that students who had some familiarity with statistics would initially assume that the prior probability was based on some subset of possible programs, as would be expected from a typical sample, so he gave this statistical style of definition to emphasize the extent of what he had in mind.

I asked this question just to make sure that I understood what Solomonoff Induction was, because Abram had made some statement indicating that I really didn't. Remember, this particular branch of the discussion was originally centered on the question of whether Solomonoff Induction would be convergent, even given a way around the incomputability of finding only those programs that halted.
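[A small Monte Carlo sketch may help illustrate the "fair coin flips" definition quoted at the top of this message. This is an editorial illustration, not code from the thread: `toy_prefix_machine` and `estimate_m` are made-up names, and the toy machine is a deliberately trivial stand-in for the universal prefix machine U, chosen so the exact probabilities are easy to check by hand.]

```python
import random

def toy_prefix_machine(next_bit):
    # Hypothetical prefix machine (a made-up stand-in for U): it reads
    # bits until it sees a 1, then halts and outputs one '0' per 0-bit
    # it read. Its valid programs "1", "01", "001", ... are prefix-free.
    zeros = 0
    while next_bit() == "0":
        zeros += 1
    return "0" * zeros

def estimate_m(x, trials=100_000, seed=0):
    # Monte Carlo estimate of m(x): feed fair coin flips to the machine
    # and count how often the output equals x.
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if toy_prefix_machine(lambda: rng.choice("01")) == x:
            hits += 1
    return hits / trials

# For this toy machine the exact values are m("") = 1/2 (program "1")
# and m("0") = 1/4 (program "01"); the estimates should land nearby.
print(estimate_m(""), estimate_m("0"))
```

[For a real universal machine, m(x) is only lower-semicomputable, which is what the rest of this thread is arguing about; the toy machine sidesteps that entirely.]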
So while the random testing of the prior probabilities is of interest to me, I wanted to make sure that there is no evidence that Solomonoff Induction is convergent. I am not being petty about this, but I also needed to make sure that I understood what Solomonoff Induction is.

I am interested in hearing your ideas about your variation of Solomonoff Induction, because your convergent series, in this context, was interesting.

Jim Bromer

On Fri, Aug 6, 2010 at 6:50 AM, Jim Bromer <[email protected]> wrote:

> Jim: So, did Solomonoff's original idea involve randomizing whether the next bit would be a 1 or a 0 in the program?
>
> Abram: Yep.

I meant: did Solomonoff's original idea involve randomizing whether the next bit would be a 1 or a 0 in the programs that are originally used to produce the *prior probabilities*? I have not been able to find any evidence that it did. I thought my question was clear, but on second thought I guess it wasn't. I think the part about the coin flips was only a way to express that he was interested in the probability that a particular string would be produced from all possible programs, so that when actually testing the prior probability of a particular string, the program to be run would have to be randomly generated.

Jim Bromer

On Wed, Aug 4, 2010 at 10:27 PM, Abram Demski <[email protected]> wrote:

Jim,

> Your function may be convergent but it is not a probability.

True! All the possibilities sum to less than 1. There are ways of addressing this (e.g., multiply by a normalizing constant which must itself be approximated in a convergent manner), but for the most part adherents of Solomonoff induction don't worry about this too much. What we care about, mostly, is comparing different hypotheses to decide which to favor.
The normalizing constant doesn't help us here, so it usually isn't mentioned.

> You said that Solomonoff's original construction involved flipping a coin for the next bit. What good does that do?

Your intuition is that running totally random programs to get predictions will just produce garbage, and that is fine. The idea of Solomonoff induction, though, is that it will produce systematically less garbage than just flipping coins to get predictions. Most of the garbage programs will be knocked out of the running by the data itself. This is supposed to be the least garbage we can manage without domain-specific knowledge.

This is backed up by the proof of dominance, which we haven't talked about yet, but which is really the key argument for the optimality of Solomonoff induction.

> And how does that prove that his original idea was convergent?

The proofs of equivalence between all the different formulations of Solomonoff induction are something I haven't cared to look into too deeply.

> Since his idea is incomputable, there are no algorithms that can be run to demonstrate what he was talking about, so the basic idea is papered over with all sorts of unverifiable approximations.

I gave you a proof of convergence for one such approximation, and if you wish I can modify it to include a normalizing constant to ensure that it is a probability measure. It would be helpful to me if your criticisms were more specific to that proof.

> So, did Solomonoff's original idea involve randomizing whether the next bit would be a 1 or a 0 in the program?

Yep.

> Even ignoring the halting problem, what kind of result would that give?

Well, the general idea is this. An even distribution intuitively represents lack of knowledge.
An even distribution over possible data fails horribly, however, predicting white noise. We want to represent the idea that we are very ignorant of what the data might be, but not *that* ignorant. To capture the idea of regularity, i.e., similarity between past and future, we instead take an even distribution over *descriptions* of the data. (The distribution in the second version of Solomonoff induction that I gave, the one in which the space of possible programs is represented as a continuum, is an even distribution.) This appears to provide a good amount of regularity without too much.

--Abram

On Wed, Aug 4, 2010 at 8:10 PM, Jim Bromer <[email protected]> wrote:

Abram,

Thanks for the explanation. I still don't get it. Your function may be convergent, but it is not a probability. You said that Solomonoff's original construction involved flipping a coin for the next bit. What good does that do? And how does that prove that his original idea was convergent? The thing that is wrong with these explanations is that they are not coherent. Since his idea is incomputable, there are no algorithms that can be run to demonstrate what he was talking about, so the basic idea is papered over with all sorts of unverifiable approximations.

So, did Solomonoff's original idea involve randomizing whether the next bit would be a 1 or a 0 in the program? Even ignoring the halting problem, what kind of result would that give? Have you ever solved the problem for some strings, and have you ever tested the solutions using a simulation?

Jim Bromer

On Mon, Aug 2, 2010 at 5:12 PM, Abram Demski <[email protected]> wrote:

Jim,

Interestingly, the formalization of Solomonoff induction I'm most familiar with uses a construction that relates the space of programs with the real numbers just as you say.
This formulation may be due to Solomonoff, or perhaps Hutter... not sure. I re-formulated it to "gloss over" that in order to make it simpler; I'm pretty sure the version I gave is equivalent in the relevant aspects. However, some notes on the original construction.

Programs are created by flipping coins to come up with the 1s and 0s. We are to think of it like this: whenever the computer reaches the end of the program and tries to continue on, we flip a coin to decide what the next bit of the program will be. We keep doing this for as long as the computer wants more bits of instruction.

This framework makes room for infinitely long programs, but makes them infinitely improbable; formally, they have probability 0. (We could alter the setup to allow them an infinitesimal probability.) Intuitively, this means that if we keep flipping a coin to tell the computer what to do, eventually we will either create an infinite loop-back (so the computer keeps executing the already-written parts of the program and never asks for more) or write out the "HALT" command. Avoiding doing one or the other forever is just too improbable.

This also means all real numbers are output by some program! It just may be one which is infinitely long.

However, all the programs that "slip past" my time bound as T increases to infinity will have measure 0, meaning they don't add anything to the sum. This means the convergence is unaffected.

Note: in this construction, program space is *still* a well-defined entity.

--Abram

On Sun, Aug 1, 2010 at 9:05 PM, Jim Bromer <[email protected]> wrote:

Abram,

This is a very interesting function. I have spent a lot of time thinking about it.
However, I do not believe that it does, in any way, prove or indicate that Solomonoff Induction is convergent. I want to discuss the function, but I need to take more time to study some stuff and to work various details out. (Although I have thought a lot about it, I am writing this under a sense of deadline, so it may not be well composed.)

My argument was that Solomonoff's conjecture, which was based (as far as I can tell) on 'all possible programs', was fundamentally flawed because the idea of 'all possible programs' is not a programmable definition. All possible programs is a domain, not a class of programs that can be feasibly defined in the form of an algorithm that could 'reach' all the programs.

The domain of all possible programs is trans-infinite, just as the domain of irrational numbers is. Why do I believe this? Because if we imagine that infinite algorithms are computable, then we could compute irrational numbers. That is, there are programs that, given infinite resources, could compute irrational numbers. We can use the binomial theorem, for example, to compute the square root of 2. And we can use trial-and-error methods to compute the nth root of any number. So that proves that there are infinitely many irrational numbers that can be computed by algorithms that run for infinity.

So what does this have to do with Solomonoff's conjecture of all possible programs? Well, if I could prove that any individual irrational number could be computed (with programs that ran through infinity), then I might be able to prove that there are trans-infinite programs.
If I could prove that some trans-infinite subset of irrational numbers could be computed, then I might be able to prove that 'all possible programs' is a trans-infinite class.

Now Abram said that since his sum, based on runtime and program length, is convergent, it can prove that Solomonoff Induction is convergent. Even assuming that his convergent sum method could be fixed up a little, I suspect that this time-length bound is misleading. Since a Turing Machine allows for erasures, a program could last longer than his time parameter and still produce an output string that matches the given string. And if 'all possible programs' is a trans-infinite class, then there are programs that you are going to miss. Your encoding method will miss trans-infinite programs (unless you have transcended the trans-infinite).

However, I want to study the function, and some other ideas related to this kind of thing, a little more. It is an interesting function. Unfortunately, I also have to get back to other-worldly things.

Jim Bromer

On Mon, Jul 26, 2010 at 2:54 PM, Abram Demski <[email protected]> wrote:

Jim,

I'll argue that Solomonoff probabilities are in fact like pi, that is, computable in the limit.

I still do not understand why you think these combinations are necessary. It is not necessary to impose some sort of ordering on the sum to get it to converge: ordering only matters for infinite sums which include negative numbers. (Perhaps that's where you're getting the idea?)

Here's my proof, rewritten from an earlier post, using the properties of infinite sums of non-negative numbers.
(preliminaries)

Define the computation as follows: we start with a string S which we want to know the Solomonoff probability of. We are given a time limit T. We start with P=0, where P is a real number with precision 2*log_4(T) or more. We use some binary encoding for programs which (unlike normal programming languages) does not contain syntactically invalid programs, but still will (of course) contain infinite loops and so on. We run programs "0" and "1" for T/4 steps each; "00", "01", "10" and "11" for T/16 each; and in general run each program of length N for floor[T/(4^N)] steps, until T/(4^N) is less than 1. Each time we run a program and the result is S, we add 1/(4^N) to P.

(assertion)

P converges to some value as T is increased.

(proof)

If every single program were to output S, then P would converge to 1/4 + 1/4 + 1/16 + 1/16 + 1/16 + 1/16 + ..., that is, 2*(1/(4^1)) + 4*(1/(4^2)) + 8*(1/(4^3)) + ..., which comes to 1/2 + 1/4 + 1/8 + 1/16 + ..., i.e. 1/(2^1) + 1/(2^2) + 1/(2^3) + ...; it is well known that this series converges to 1. Thus 1 is an upper bound for P: it could only get that high if every single program were to output S.

0 is a lower bound, since we start there and never subtract anything. In fact, every time we add more to P, we have a new lower bound: P will never go below a number once it reaches it. The sum can only increase. Infinite sums with this property must either converge to a finite number or go to infinity. However, we already know that 1 is an upper bound for P, so it cannot go to infinity. Hence, it must converge.
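[The scheme described above can be put in runnable form. This is an editorial sketch, not code from the thread: `toy_run` is a hypothetical stand-in interpreter that simply echoes its program once given enough steps, since a real binary program encoding and interpreter are beyond a sketch. Only `approx_probability` follows the scheme as stated: length-N programs get floor(T/4^N) steps and contribute 1/4^N on a match.]

```python
from itertools import product

def toy_run(program, steps):
    # Hypothetical stand-in for a real interpreter: after len(program)
    # steps this toy "machine" halts and outputs the program itself.
    # A real implementation would interpret the bits as instructions.
    if steps >= len(program):
        return program
    return None  # budget exhausted before the toy machine halted

def approx_probability(S, T):
    # The scheme from the proof: run each program of length N for
    # floor(T/4^N) steps, adding 1/4^N to P whenever the output is S.
    P = 0.0
    N = 1
    while T // 4 ** N >= 1:
        budget = T // 4 ** N
        for bits in product("01", repeat=N):
            if toy_run("".join(bits), budget) == S:
                P += 1.0 / 4 ** N
        N += 1
    return P

# P never decreases as T grows, and 1 is an upper bound
# (the sum over N of 2^N / 4^N is 1), so the approximation converges.
for T in (16, 256, 4096):
    print(T, approx_probability("01", T))
```

[With the toy interpreter, only the program identical to S ever matches, so P settles at 1/(4^len(S)) once T is large enough for that length to get a sufficient budget; the monotone, bounded behaviour is the point being illustrated.]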
--Abram

On Mon, Jul 26, 2010 at 9:14 AM, Jim Bromer <[email protected]> wrote:

As far as I can tell right now, my theories that Solomonoff Induction is trans-infinite were wrong. Now that I realize that the mathematics does not support these conjectures, I have to acknowledge that I would not be able to prove, or even offer a sketch of a proof of, my theories. Although I did not use rigorous mathematics while trying to make an assessment of the Solomonoff method, the first principle of rigorous mathematics is to acknowledge that the mathematics does not support your supposition when it doesn't.

Solomonoff Induction isn't a mathematical theory because the desired results are not computable. As I mentioned before, there are a great many functions that are not computable but which are useful and important because they tend toward a limit which can be seen with a reasonable number of calculations using the methods available. Pi is one such function. (I am presuming that pi would require an infinite expansion, which seems right.)

I have explained, and I think it is a correct explanation, that there is no way you could make an a priori computation of all possible combinations taken from infinite values. So you could not even come up with a theoretical construct that could take account of that level of complexity. It is true that you could come up with a theoretical computational method that could take account of any finite number of values, and that is what we are talking about when we talk about the infinite, but in this context it only points to a theoretical paradox.
Your theoretical solution could not take the final step of computing a probability for a string until it had run through the infinite combinations, and this is impossible. The same problem does not occur for pi, because the function that produces it tends toward a limit.

The reason I thought Solomonoff Induction was trans-infinite was because I thought it used the Bayesian notion to compute the probability using all possible programs that produced a particular substring following a given prefix. Now I understand that the desired function is the computation of only the probability of a particular string (not all possible substrings that are identical to the string) following a given prefix. I want to study the method a little further during the next few weeks, but I just wanted to make it clear that, as far as I now understand the program, I do not think that it is trans-infinite.

Jim Bromer

--
Abram Demski
http://lo-tho.blogspot.com/
http://groups.google.com/group/one-logic
