You can always find languages that favor either hypothesis. Suppose that you
want to predict the sequence 10, 21, 32, ?, and we write each hypothesis as a
function that takes the trial number (0, 1, 2, 3...) and returns the outcome.
The sequence 10, 21, 32, 43, 54... would be coded:
int hypothesis_1(int trial) {
    return trial*11 + 10;
}
The sequence 10, 21, 32, 10, 21, 32... would be coded:
int hypothesis_2(int trial) {
    return (trial%3)*11 + 10;
}
which is longer and therefore less likely.
Here is another example: predict the sequence 0, 1, 4, 9, 16, 25, 36, 49, ?
Can you find a program shorter than this that doesn't predict 64?
int hypothesis_1(int trial) {
    return trial*trial;
}
-- Matt Mahoney, [email protected]
________________________________
From: David Jones <[email protected]>
To: agi <[email protected]>
Sent: Tue, June 29, 2010 3:48:01 PM
Subject: Re: [agi] Re: Huge Progress on the Core of AGI
Such an example is nowhere near sufficient to accept the assertion that
program size is the right way to define the simplicity of a hypothesis.
Here is a counterexample. It requires a slightly more complex sequence, because
a string of all zeros leaves no room for alternative hypotheses.
Here is the sequence: 10, 21, 32
void hypothesis_1() {
    int ten = 10;
    int counter = 0;
    while (1) {
        printf("%d ", ten + counter);
        ten = ten + 10;
        counter = counter + 1;
    }
}
void hypothesis_2() {
    while (1)
        printf("10 21 32 ");
}
Hypothesis 2 is simpler, yet clearly wrong. These examples don't really show
anything.
Dave
On Tue, Jun 29, 2010 at 3:15 PM, Matt Mahoney <[email protected]> wrote:
David Jones wrote:
>> I really don't think this is the right way to calculate simplicity.
>
>
>I will give you an example, because examples are more convincing than proofs.
>
>
>Suppose you perform a sequence of experiments whose outcome can either be 0 or
>1. In the first 10 trials you observe 0000000000. What do you expect to
>observe in the next trial?
>
>
>Hypothesis 1: the outcome is always 0.
>Hypothesis 2: the outcome is 0 for the first 10 trials and 1 thereafter.
>
>
>Hypothesis 1 is shorter than 2, so it is more likely to be correct.
>
>
>If I describe the two hypotheses in French or Chinese, then 1 is still shorter than 2.
>
>
>If I describe the two hypotheses in C, then 1 is shorter than 2.
>
>
> void hypothesis_1() {
>     while (1) printf("0");
> }
>
>
> void hypothesis_2() {
>     int i;
>     for (i = 0; i < 10; ++i) printf("0");
>     while (1) printf("1");
> }
>
>
>If I translate these programs into Perl or Lisp or x86 assembler, then 1 will
>still be shorter than 2.
>
>
>I realize there might be smaller equivalent programs. But I think the smallest
>program equivalent to hypothesis_1 will be shorter than the smallest program
>equivalent to hypothesis_2.
>
>
>I realize there are other hypotheses than 1 or 2. But I think that the
>smallest one you can find that outputs eleven bits, of which the first ten are
>zeros, will be a program that outputs another zero.
>
>
>I realize that you could rewrite 1 so that it is longer than 2. But it is the
>shortest version that counts. More specifically, consider all programs whose
>first 10 outputs are 0. Then weight each program by 2^-length, so that the
>shortest programs dominate.
>
>
>I realize you could make up a language where the shortest encoding of
>hypothesis 2 is shorter than 1. You could do this for any pair of hypotheses.
>However, I think if you stick to "simple" languages (and I realize this is a
>circular definition), then 1 will usually be shorter than 2.
>
> -- Matt Mahoney, [email protected]
>
________________________________
From: David Jones <[email protected]>
>To: agi <[email protected]>
>Sent: Tue, June 29, 2010 1:31:01 PM
>
>Subject: Re: [agi] Re: Huge Progress on the Core of AGI
>
>>On Tue, Jun 29, 2010 at 11:26 AM, Matt Mahoney <[email protected]> wrote:
>
>>>
>>> Right. But Occam's Razor is not complete. It says simpler is better, but 1)
>>> this only applies when two hypotheses have the same explanatory power and
>>> 2) what defines simpler?
>>
>>
>>A hypothesis is a program that outputs the observed data. It "explains" the
>>data if its output matches what is observed. The "simpler" hypothesis is the
>>shorter program, measured in bits.
>
>I can't be confident that bits is the right way to do it. I suspect bits is an
>approximation of a more accurate method. I also suspect that you can write a
>more complex explanation "program" with the same number of bits. So, there are
>some flaws with this approach. It is an interesting idea to consider though.
>
>>The language used to describe the data can be any Turing complete programming
>>language (C, Lisp, etc) or any natural language such as English. It does not
>>matter much which language you use, because for any two languages there is a
>>fixed length procedure, described in either of the languages, independent of
>>the data, that translates descriptions in one language to the other.
>
>Hypotheses don't have to be written in actual computer code, and probably
>shouldn't be, because hypotheses are not really meant to be "run" per se. And
>outputs are not necessarily the right way to put it either. Outputs imply
>prediction. And as Mike has often pointed out, things cannot be precisely
>predicted. We can, however, determine whether a particular observation fits
>expectations, rather than equals some prediction. There may be multiple
>possible outcomes that we expect and which would be consistent with a
>hypothesis, which is why actual prediction should not be used.
>
>
>> For example, the simplest hypothesis for all visual interpretation is that
>> everything in the first image is gone in the second image, and everything in
>> the second image is a new object. Simple. Done. Solved :) right?
>>
>>
>>The hypothesis is not the simplest. The program that outputs the two frames
>>as if independent cannot be smaller than the two frames compressed
>>independently. The program could be made smaller if it only described how the
>>second frame is different than the first. It would be more likely to
>>correctly predict the third frame if it continued to run and described how it
>>would be different than the second frame.
>
>I really don't think this is the right way to calculate simplicity.
>
>
>>>
>>
>>
>>> I don't think much progress has been made in this area, but I'd like to
>>> know what other people have done and any successes they've had.
>>
>>
>>Kolmogorov proved that the solution is not computable. Given a hypothesis (a
>>description of the observed data, or a program that outputs the observed
>>data), there is no general procedure or test to determine whether a shorter
>>(simpler, better) hypothesis exists. Proof: suppose there were. Then I could
>>describe "the first data set that cannot be described in less than a million
>>bits" even though I just did. (By "first" I mean the first data set encoded
>>by a string from shortest to longest, breaking ties lexicographically.)
>
>You are making a different point than the question I asked. It may not be
>computable to determine whether a better hypothesis exists, but that is not
>what I want to know. What I want to know is what is the better of any two
>hypotheses I can come up with. Creating the hypotheses requires a different
>algorithm, which to me is not as hard as the other parts of the problem.
>
>>That said, I believe the state of the art in both language and vision are
>>based on hierarchical neural models, i.e. pattern recognition using learned
>>weighted combinations of simpler patterns. I am more familiar with language.
>>The top ranked programs can be found at http://mattmahoney.net/dc/text.html
>
>The state of the art in explanatory reasoning is what I'm looking for.
>
>
>>>
>>
>> -- Matt Mahoney, [email protected]
>>
________________________________
From: David Jones <[email protected]>
>>To: agi <[email protected]>
>>Sent: Tue, June 29, 2010 10:44:41 AM
>>Subject: Re: [agi] Re: Huge Progress on the Core of AGI
>>
>>
>>Thanks Matt,
>>
>>Right. But Occam's Razor is not complete. It says simpler is better, but 1)
>>this only applies when two hypotheses have the same explanatory power and 2)
>>what defines simpler?
>>
>>So, maybe what I want to know from the state of the art in research is:
>>
>>1) how precisely do other people define "simpler"
>>and
>>2) More importantly, how do you compare competing explanations/hypotheses
>>that have more or less explanatory power. Simpler does not apply unless you
>>are comparing equally explanatory hypotheses.
>>
>>For example, the simplest hypothesis for all visual interpretation is that
>>everything in the first image is gone in the second image, and everything in
>>the second image is a new object. Simple. Done. Solved :) right? Well,
>>clearly a more complicated explanation is warranted because a more
>>complicated explanation is more *explanatory* and a better explanation. So,
>>why is it better? Can it be defined as better in a precise way so that you
>>can compare arbitrary hypotheses or explanations? That is what I'm trying to
>>learn about. I don't think much progress has been made in this area, but I'd
>>like to know what other people have done and any successes they've had.
>>
>>Dave
>>
>>
>>
>>On Tue, Jun 29, 2010 at 10:29 AM, Matt Mahoney <[email protected]> wrote:
>>
>>David Jones wrote:
>>>> If anyone has any knowledge of or references to the state of the art in
>>>> explanation-based reasoning, can you send me keywords or links?
>>>
>>>
>>>The simplest explanation of the past is the best predictor of the future.
>>>http://en.wikipedia.org/wiki/Occam's_razor
>>>http://www.scholarpedia.org/article/Algorithmic_probability
>>>
>>> -- Matt Mahoney, [email protected]
>>>
________________________________
From: David Jones <[email protected]>
>>>To: agi <[email protected]>
>>>Sent: Tue, June 29, 2010 9:05:45 AM
>>>Subject: [agi] Re: Huge Progress on the Core of AGI
>>>
>>>
>>>If anyone has any knowledge of or references to the state of the art in
>>>explanation-based reasoning, can you send me keywords or links? I've read
>>>some through google, but I'm not really satisfied with anything I've found.
>>>
>>>Thanks,
>>>
>>>Dave
>>>
>>>
>>>On Sun, Jun 27, 2010 at 1:31 AM, David Jones <[email protected]> wrote:
>>>
>>>>A method for comparing hypotheses in explanatory-based reasoning:
>>>>
>>>>We prefer the hypothesis or explanation that *expects* more observations.
>>>>If both explanations expect the same observations, then the simpler of the
>>>>two is preferred (because the unnecessary terms of the more complicated
>>>>explanation do not add to the predictive power).
>>>>
>>>>Why are expected events so important? They are a measure of 1) explanatory
>>>>power and 2) predictive power. The more predictive and
>>>>the more explanatory a hypothesis is, the more likely the hypothesis is
>>>>when compared to a competing hypothesis.
>>>>
>>>>Here are two case studies I've been analyzing from sensory perception of
>>>>simplified visual input:
>>>>
>>>>The goal of the case studies is to answer the following: How do you
>>>>generate the most likely motion hypothesis in a way that is
>>>>general and applicable to AGI?
>>>>Case Study 1) Here is a link to an example: animated gif of two black
>>>>squares move from left to right. Description: Two black squares are moving
>>>>in unison from left to right across a white screen. In each frame the black
>>>>squares shift to the right so that square 1 steals square 2's original
>>>>position and square two moves an equal distance to the right.
>>>>Case Study 2) Here is a link to an example: the interrupted square.
>>>>Description: A single square is moving from left to right. Suddenly in the
>>>>third frame, a single black square is added in the middle of the expected
>>>>path of the original black square. This second square just stays there. So,
>>>>what happened? Did the square moving from left to right keep moving? Or did
>>>>it stop and then another square suddenly appeared and moved from left to
>>>>right?
>>>>
>>>>Here is a simplified version of how we solve case study 1:
>>>>The important hypotheses to consider are:
>>>>1) the square from frame 1 of the video that has a very close position to
>>>>the square from frame 2 should be matched (we hypothesize that they are the
>>>>same square and that any difference in position is motion). So, what
>>>>happens is that in each two frames of the video, we only match one square.
>>>>The other square goes unmatched.
>>>>
>>>>2) We do the same thing as in hypothesis #1, but this time we also match
>>>>the remaining squares and hypothesize motion as follows: the first square
>>>>jumps over the second square from left to right. We hypothesize that this
>>>>happens over and over in each frame of the video. Square 2 stops and square
>>>>1 jumps over it.... over and over again.
>>>>
>>>>3) We hypothesize that both squares move to the right in unison. This is
>>>>the correct hypothesis.
>>>>
>>>>So, why should we prefer the correct hypothesis, #3 over the other two?
>>>>
>>>>Well, first of all, #3 is correct because it has the most explanatory power
>>>>of the three and is the simplest of the three. Simpler is better because,
>>>>with the given evidence and information, there is no reason to desire a
>>>>more complicated hypothesis such as #2.
>>>>
>>>>So, the answer to the question is because explanation #3 expects the most
>>>>observations, such as:
>>>>1) the consistent relative positions of the squares in each frame are
>>>>expected.
>>>>2) It also expects their new positions in each frame based on velocity
>>>>calculations.
>>>>
>>>>3) It expects both squares to occur in each frame.
>>>>
>>>>Explanation 1 ignores one square in each frame of the video, because it
>>>>can't match it. Hypothesis #1 has no reason why a new square appears in
>>>>each frame and why one disappears. It doesn't expect these observations. In
>>>>fact, explanation 1 doesn't expect anything that happens, because something
>>>>new happens in each frame, which doesn't give it a chance to confirm its
>>>>hypotheses in subsequent frames.
>>>>
>>>>The power of this method is immediately clear. It is general and it solves
>>>>the problem very cleanly.
>>>>
>>>>Here is a simplified version of how we solve case study 2:
>>>>We expect the original square to move at a similar velocity from left to
>>>>right because we hypothesized that it did move from left to right and we
>>>>calculated its velocity. If this expectation is confirmed, then it is more
>>>>likely than saying that the square suddenly stopped and another started
>>>>moving. Such a change would be unexpected and such a conclusion would be
>>>>unjustifiable.
>>>>
>>>>I also believe that explanations which generate fewer incorrect
>>>>expectations should be preferred over those that generate more incorrect
>>>>expectations.
>>>>
>>>>The idea I came up with earlier this month regarding high frame rates to
>>>>reduce uncertainty is still applicable. It is important that all generated
>>>>hypotheses have as low uncertainty as possible given our constraints and
>>>>resources available.
>>>>
>>>>I thought I'd share my progress with you all. I'll be testing the ideas on
>>>>test cases such as the ones I mentioned in the coming days and weeks.
>>>>
>>>>Dave
>>>>
>>>
-------------------------------------------
agi
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/
Modify Your Subscription:
https://www.listbox.com/member/?member_id=8660244&id_secret=8660244-6e7fb59c
Powered by Listbox: http://www.listbox.com