Such an example is no where near sufficient to accept the assertion that program size is the right way to define simplicity of a hypothesis.
Here is a counter example. It requires a slightly more complex example because all zeros doesn't leave any room for alternative hypotheses. Here is the sequence: 10, 21, 32 void hypothesis_1() { int ten = 10; int counter = 0; while (1) { print(ten+counter) ten = ten + 10; counter = counter + 1; } } void hypothesis_2() { while (1) print("10 21 32") } Hypothesis 2 is simpler, yet clearly wrong. These examples don't really show anything. Dave On Tue, Jun 29, 2010 at 3:15 PM, Matt Mahoney <matmaho...@yahoo.com> wrote: > David Jones wrote: > > I really don't think this is the right way to calculate simplicity. > > I will give you an example, because examples are more convincing than > proofs. > > Suppose you perform a sequence of experiments whose outcome can either be 0 > or 1. In the first 10 trials you observe 0000000000. What do you expect to > observe in the next trial? > > Hypothesis 1: the outcome is always 0. > Hypothesis 2: the outcome is 0 for the first 10 trials and 1 thereafter. > > Hypothesis 1 is shorter than 2, so it is more likely to be correct. > > If I describe the two hypotheses in French or Chinese, then 1 is still > shorter than 2. > > If I describe the two hypotheses in C, then 1 is shorter than 2. > > void hypothesis_1() { > while (1) printf("0"); > } > > void hypothesis_2() { > int i; > for (i=0; i<10; ++i) printf("0"); > while (1) printf("1"); > } > > If I translate these programs into Perl or Lisp or x86 assembler, then 1 > will still be shorter than 2. > > I realize there might be smaller equivalent programs. But I think you could > find a smaller program equivalent to hypothesis_1 than hypothesis_2. > > I realize there are other hypotheses than 1 or 2. But I think that the > smallest one you can find that outputs eleven bits of which the first ten > are zeros will be a program that outputs another zero. > > I realize that you could rewrite 1 so that it is longer than 2. But it is > the shortest version that counts. More specifically consider all programs in > which the first 10 outputs are 0. Then weight each program by 2^-length. So > the shortest programs dominate. > > I realize you could make up a language where the shortest encoding of > hypothesis 2 is shorter than 1. You could do this for any pair of > hypotheses. However, I think if you stick to "simple" languages (and I > realize this is a circular definition), then 1 will usually be shorter than > 2. > > > -- Matt Mahoney, matmaho...@yahoo.com > > > ------------------------------ > *From:* David Jones <davidher...@gmail.com> > *To:* agi <agi@v2.listbox.com> > *Sent:* Tue, June 29, 2010 1:31:01 PM > > *Subject:* Re: [agi] Re: Huge Progress on the Core of AGI > > > > On Tue, Jun 29, 2010 at 11:26 AM, Matt Mahoney <matmaho...@yahoo.com>wrote: > >> > Right. But Occam's Razor is not complete. It says simpler is better, but >> 1) this only applies when two hypotheses have the same explanatory power and >> 2) what defines simpler? >> >> A hypothesis is a program that outputs the observed data. It "explains" >> the data if its output matches what is observed. The "simpler" hypothesis is >> the shorter program, measured in bits. >> > > I can't be confident that bits is the right way to do it. I suspect bits is > an approximation of a more accurate method. I also suspect that you can > write a more complex explanation "program" with the same number of bits. So, > there are some flaws with this approach. It is an interesting idea to > consider though. > > >> >> The language used to describe the data can be any Turing complete >> programming language (C, Lisp, etc) or any natural language such as >> English. It does not matter much which language you use, because for any two >> languages there is a fixed length procedure, described in either of the >> languages, independent of the data, that translates descriptions in one >> language to the other. >> > > Hypotheses don't have to be written in actual computer code and probably > shouldn't be because hypotheses are not really meant to be "run" per say. > And outputs are not necessarily the right way to put it either. Outputs > imply prediction. And as mike has often pointed out, things cannot be > precisely predicted. We can, however, determine whether a particular > observation fits expectations, rather than equals some prediction. There may > be multiple possible outcomes that we expect and which would be consistent > with a hypothesis, which is why actual prediction should not be used. > > > For example, the simplest hypothesis for all visual interpretation is >> that everything in the first image is gone in the second image, and >> everything in the second image is a new object. Simple. Done. Solved :) >> right? >> >> The hypothesis is not the simplest. The program that outputs the two >> frames as if independent cannot be smaller than the two frames compressed >> independently. The program could be made smaller if it only described how >> the second frame is different than the first. It would be more likely to >> correctly predict the third frame if it continued to run and described how >> it would be different than the second frame. >> > > I really don't think this is the right way to calculate simplicity. > > >> >> > I don't think much progress has been made in this area, but I'd like to >> know what other people have done and any successes they've had. >> >> Kolmogorov proved that the solution is not computable. Given a hypothesis >> (a description of the observed data, or a program that outputs the observed >> data), there is no general procedure or test to determine whether a shorter >> (simpler, better) hypothesis exists. Proof: suppose there were. Then I could >> describe "the first data set that cannot be described in less than a million >> bits" even though I just did. (By "first" I mean the first data set encoded >> by a string from shortest to longest, breaking ties lexicographically). >> > > You are making a different point than the question I asked. It may not be > computable to determine whether a better hypothesis exists, but that is not > what I want to know. What I want to know is what is the better of any two > hypotheses I can come up with. Creating the hypotheses requires a different > algorithm, which is not as hard to me as the other parts of the problem. > > >> >> That said, I believe the state of the art in both language and vision are >> based on hierarchical neural models, i.e. pattern recognition using learned >> weighted combinations of simpler patterns. I am more familiar with language. >> The top ranked programs can be found at >> http://mattmahoney.net/dc/text.html >> > > state of the art in explanatory reasoning is what I'm looking for. > > >> >> >> -- Matt Mahoney, matmaho...@yahoo.com >> >> >> ------------------------------ >> *From:* David Jones <davidher...@gmail.com> >> *To:* agi <agi@v2.listbox.com> >> *Sent:* Tue, June 29, 2010 10:44:41 AM >> *Subject:* Re: [agi] Re: Huge Progress on the Core of AGI >> >> Thanks Matt, >> >> Right. But Occam's Razor is not complete. It says simpler is better, but >> 1) this only applies when two hypotheses have the same explanatory power and >> 2) what defines simpler? >> >> So, maybe what I want to know from the state of the art in research is: >> >> 1) how precisely do other people define "simpler" >> and >> 2) More importantly, how do you compare competing explanations/hypotheses >> that have more or less explanatory power. Simpler does not apply unless you >> are comparing equally explanatory hypotheses. >> >> For example, the simplest hypothesis for all visual interpretation is that >> everything in the first image is gone in the second image, and everything in >> the second image is a new object. Simple. Done. Solved :) right? Well, >> clearly a more complicated explanation is warranted because a more >> complicated explanation is more *explanatory* and a better explanation. So, >> why is it better? Can it be defined as better in a precise way so that you >> can compare arbitrary hypotheses or explanations? That is what I'm trying to >> learn about. I don't think much progress has been made in this area, but I'd >> like to know what other people have done and any successes they've had. >> >> Dave >> >> >> On Tue, Jun 29, 2010 at 10:29 AM, Matt Mahoney <matmaho...@yahoo.com>wrote: >> >>> David Jones wrote: >>> > If anyone has any knowledge of or references to the state of the art in >>> explanation-based reasoning, can you send me keywords or links? >>> >>> The simplest explanation of the past is the best predictor of the future. >>> http://en.wikipedia.org/wiki/Occam's_razor<http://en.wikipedia.org/wiki/Occam%27s_razor> >>> <http://en.wikipedia.org/wiki/Occam%27s_razor> >>> http://www.scholarpedia.org/article/Algorithmic_probability >>> <http://www.scholarpedia.org/article/Algorithmic_probability> >>> >>> -- Matt Mahoney, matmaho...@yahoo.com >>> >>> >>> ------------------------------ >>> *From:* David Jones <davidher...@gmail.com> >>> >>> *To:* agi <agi@v2.listbox.com> >>> *Sent:* Tue, June 29, 2010 9:05:45 AM >>> *Subject:* [agi] Re: Huge Progress on the Core of AGI >>> >>> If anyone has any knowledge of or references to the state of the art in >>> explanation-based reasoning, can you send me keywords or links? I've read >>> some through google, but I'm not really satisfied with anything I've found. >>> >>> Thanks, >>> >>> Dave >>> >>> On Sun, Jun 27, 2010 at 1:31 AM, David Jones <davidher...@gmail.com>wrote: >>> >>>> A method for comparing hypotheses in explanatory-based reasoning: * >>>> >>>> We prefer the hypothesis or explanation that ***expects* more >>>> observations. If both explanations expect the same observations, then the >>>> simpler of the two is preferred (because the unnecessary terms of the more >>>> complicated explanation do not add to the predictive power).* >>>> >>>> *Why are expected events so important?* They are a measure of 1) >>>> explanatory power and 2) predictive power. The more predictive and the more >>>> explanatory a hypothesis is, the more likely the hypothesis is when >>>> compared >>>> to a competing hypothesis. >>>> >>>> Here are two case studies I've been analyzing from sensory perception of >>>> simplified visual input: >>>> The goal of the case studies is to answer the following: How do you >>>> generate the most likely motion hypothesis in a way that is general and >>>> applicable to AGI? >>>> *Case Study 1)* Here is a link to an example: animated gif of two black >>>> squares move from left to >>>> right<http://practicalai.org/images/CaseStudy1.gif>. >>>> *Description: *Two black squares are moving in unison from left to >>>> right across a white screen. In each frame the black squares shift to the >>>> right so that square 1 steals square 2's original position and square two >>>> moves an equal distance to the right. >>>> *Case Study 2) *Here is a link to an example: the interrupted >>>> square<http://practicalai.org/images/CaseStudy2.gif>. >>>> *Description:* A single square is moving from left to right. Suddenly >>>> in the third frame, a single black square is added in the middle of the >>>> expected path of the original black square. This second square just stays >>>> there. So, what happened? Did the square moving from left to right keep >>>> moving? Or did it stop and then another square suddenly appeared and moved >>>> from left to right? >>>> >>>> *Here is a simplified version of how we solve case study 1: >>>> *The important hypotheses to consider are: >>>> 1) the square from frame 1 of the video that has a very close position >>>> to the square from frame 2 should be matched (we hypothesize that they are >>>> the same square and that any difference in position is motion). So, what >>>> happens is that in each two frames of the video, we only match one square. >>>> The other square goes unmatched. >>>> 2) We do the same thing as in hypothesis #1, but this time we also match >>>> the remaining squares and hypothesize motion as follows: the first square >>>> jumps over the second square from left to right. We hypothesize that this >>>> happens over and over in each frame of the video. Square 2 stops and square >>>> 1 jumps over it.... over and over again. >>>> 3) We hypothesize that both squares move to the right in unison. This is >>>> the correct hypothesis. >>>> >>>> So, why should we prefer the correct hypothesis, #3 over the other two? >>>> >>>> Well, first of all, #3 is correct because it has the most explanatory >>>> power of the three and is the simplest of the three. Simpler is better >>>> because, with the given evidence and information, there is no reason to >>>> desire a more complicated hypothesis such as #2. >>>> >>>> So, the answer to the question is because explanation #3 expects the >>>> most observations, such as: >>>> 1) the consistent relative positions of the squares in each frame are >>>> expected. >>>> 2) It also expects their new positions in each from based on velocity >>>> calculations. >>>> 3) It expects both squares to occur in each frame. >>>> >>>> Explanation 1 ignores 1 square from each frame of the video, because it >>>> can't match it. Hypothesis #1 doesn't have a reason for why the a new >>>> square >>>> appears in each frame and why one disappears. It doesn't expect these >>>> observations. In fact, explanation 1 doesn't expect anything that happens >>>> because something new happens in each frame, which doesn't give it a chance >>>> to confirm its hypotheses in subsequent frames. >>>> >>>> The power of this method is immediately clear. It is general and it >>>> solves the problem very cleanly. >>>> >>>> *Here is a simplified version of how we solve case study 2:* >>>> We expect the original square to move at a similar velocity from left to >>>> right because we hypothesized that it did move from left to right and we >>>> calculated its velocity. If this expectation is confirmed, then it is more >>>> likely than saying that the square suddenly stopped and another started >>>> moving. Such a change would be unexpected and such a conclusion would be >>>> unjustifiable. >>>> >>>> I also believe that explanations which generate fewer incorrect >>>> expectations should be preferred over those that more incorrect >>>> expectations. >>>> >>>> The idea I came up with earlier this month regarding high frame rates to >>>> reduce uncertainty is still applicable. It is important that all generated >>>> hypotheses have as low uncertainty as possible given our constraints and >>>> resources available. >>>> >>>> I thought I'd share my progress with you all. I'll be testing the ideas >>>> on test cases such as the ones I mentioned in the coming days and weeks. >>>> >>>> Dave >>>> >>> >>> *agi* | Archives <https://www.listbox.com/member/archive/303/=now> >>> <https://www.listbox.com/member/archive/rss/303/> | >>> Modify<https://www.listbox.com/member/?&>Your Subscription >>> <http://www.listbox.com> >>> *agi* | Archives <https://www.listbox.com/member/archive/303/=now> >>> <https://www.listbox.com/member/archive/rss/303/> | >>> Modify<https://www.listbox.com/member/?&>Your Subscription >>> <http://www.listbox.com> >>> >> >> *agi* | Archives <https://www.listbox.com/member/archive/303/=now> >> <https://www.listbox.com/member/archive/rss/303/> | >> Modify<https://www.listbox.com/member/?&>Your Subscription >> <http://www.listbox.com> >> *agi* | Archives <https://www.listbox.com/member/archive/303/=now> >> <https://www.listbox.com/member/archive/rss/303/> | >> Modify<https://www.listbox.com/member/?&>Your Subscription >> <http://www.listbox.com> >> > > *agi* | Archives <https://www.listbox.com/member/archive/303/=now> > <https://www.listbox.com/member/archive/rss/303/> | > Modify<https://www.listbox.com/member/?&>Your Subscription > <http://www.listbox.com> > *agi* | Archives <https://www.listbox.com/member/archive/303/=now> > <https://www.listbox.com/member/archive/rss/303/> | > Modify<https://www.listbox.com/member/?&>Your Subscription > <http://www.listbox.com> > ------------------------------------------- agi Archives: https://www.listbox.com/member/archive/303/=now RSS Feed: https://www.listbox.com/member/archive/rss/303/ Modify Your Subscription: https://www.listbox.com/member/?member_id=8660244&id_secret=8660244-6e7fb59c Powered by Listbox: http://www.listbox.com