Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-24 Thread David Jones
Abram,

I haven't found a method that I think works consistently yet. Basically I
was trying methods like the one you suggested, which measure the number of
correct predictions or expectations. But then I ran into the problem: what
if the predictions you are counting are more of the same? Do you count them
or not? For example, let's say that we see a piece of paper on a table in an
image, and we see that the paper looks different but moves with the table.
So, we can hypothesize that they are attached. Now what if it is not a piece
of paper, but a mural? Do you count every little piece of the mural that
moves with the table as a correct prediction? Is it a single prediction?
What about the number of times they move together? It doesn't seem right to
count each and every occurrence, but we also have to be careful about
coincidental movement. Just because two things seem to move together in one
frame out of 1000 does not mean we should consider them attached, even
temporarily.
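
To make the counting ambiguity concrete, here is a toy sketch in Python (the
names are hypothetical; it only illustrates the problem, it does not solve
it):

    # Each observation: (region_id, frame_index, moved_with_table)
    observations = [("mural_patch_%d" % i, f, True)
                    for i in range(100)   # 100 patches of the mural
                    for f in range(30)]   # observed over 30 frames

    # Option A: count every confirmed expectation, per patch, per frame.
    score_a = sum(1 for _, _, ok in observations if ok)       # 3000

    # Option B: count each region once, however many frames confirm it.
    score_b = len({r for r, _, ok in observations if ok})     # 100

    # Option C: the whole mural moves rigidly, so count one prediction.
    score_c = 1

Any of the three is defensible, and each one ranks the attachment hypothesis
differently against its rivals.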

So, quantitatively defining "simpler" and "predictive" is quite challenging.
I am honestly a bit stumped about how to do it at the moment. I will keep
trying to find ways to at least approximate it, but I'm really not sure of
the best way.

Of course, I haven't been working on this specific problem long, but other
people have tried to quantify our explanatory methods in other areas and
have also failed. I think part of the failure comes from trying to explain
very different things with a single method, when they probably require
different methods, and more heuristic than mathematically precise ones at
that. It's all quite overwhelming to analyze sometimes.

I have also considered fractions correct vs. incorrect. The truth is, I
haven't locked onto and carefully analyzed the different ideas I've come up
with, because they all seem to have issues and are difficult to analyze. I
definitely need to try some out, see what the results are, and document them
better.

Dave

On Thu, Jul 22, 2010 at 10:23 PM, Abram Demski abramdem...@gmail.com wrote:

 David,

 What are the different ways you are thinking of for measuring the
 predictiveness? I can think of a few different possibilities (such as
 measuring number incorrect vs measuring fraction incorrect, et cetera) but
 I'm wondering which variations you consider significant/troublesome/etc.

 --Abram

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-24 Thread David Jones
Abram,

I should also mention that I ran into problems mainly because I was having a
hard time deciding how to identify objects and determine what is really
going on in a scene. This adds a whole other layer of complexity to
hypotheses. It's not just about what is more predictive of the observations;
it is about deciding what exactly you are observing in the first place
(although you might say it's the same problem).

I ran into this problem when my algorithm finds matches between items that
are not the same, or fails to find any matches between items that are the
same but have changed. So, how do you decide whether it is:

1) the same object,
2) a different object, or
3) the same object, but changed?

And how do you decide its relationship to something else? Is it:

1) dependently attached,
2) semi-dependently attached (able to move independently, but only in
   certain ways, while also moving dependently),
3) independent,
4) sometimes dependent,
5) formerly dependent, but no longer, or
6) formerly dependent on something else, then independent, and now dependent
   on something new?

These alternatives are written out as explicit labels in the sketch below.
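
As a sketch (the labels are only illustrative; scoring and ranking the
combinations is exactly the open problem):

    from enum import Enum

    class Identity(Enum):
        SAME = 1           # the same object, unchanged
        DIFFERENT = 2      # a different object
        SAME_CHANGED = 3   # the same object, but it has changed

    class Attachment(Enum):
        DEPENDENT = 1       # always moves with the other object
        SEMI_DEPENDENT = 2  # moves independently, but only in certain ways
        INDEPENDENT = 3
        SOMETIMES = 4       # sometimes dependent
        FORMERLY = 5        # was dependent, but no longer is
        REATTACHED = 6      # dependent, then independent, now dependent again

    # A full hypothesis about a pair of tracked items is then a pair
    # (Identity, Attachment); the matcher has to rank all 18 combinations.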

These hypotheses are different ways of explaining the same observations, but
are complicated by the fact that we aren't sure of the identity of the
objects we are observing in the first place. Multiple hypotheses may fit the
same observations, and it's hard to decide why one is simpler or better than
the other. The object you were observing at first may have disappeared. A
new object may have appeared at the same time (this is why screenshots are a
bit malicious). Or the object you were observing may have changed. In
screenshots, sometimes the objects that you are trying to identify as
different never appear at the same time because they always completely
occlude each other. So, that can make it extremely difficult to decide
whether they are the same object that has changed or different objects.

Such ambiguities are common in AGI. It is unclear to me yet how to deal with
them effectively, although I am continuing to work hard on it.

I know it's a bit of a mess, but I'm just trying to demonstrate the trouble
I've run into.

I hope that makes it more clear why I'm having so much trouble finding a way
of determining what hypothesis is most predictive and simplest.

Dave

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-24 Thread Matt Mahoney
David Jones wrote:
 I should also mention that I ran into problems mainly because I was having a
hard time deciding how to identify objects and determine what is really going
on in a scene.

I think that your approach makes the problem harder than it needs to be (not
that it is easy). Natural language processing is hard, so researchers, in an
attempt to break down the task into simpler parts, focused on steps like
lexical analysis, parsing, part-of-speech resolution, and semantic analysis.
While these problems went unsolved, Google went directly to a solution by
skipping them.

Likewise, parsing an image into physically separate objects and then building
a 3-D model makes the problem harder, not easier. Again, look at the whole
picture. You input an image and output a response. Let the system figure out
which features are important. If your goal is to count basketball passes, then
it is irrelevant whether the AGI recognizes that somebody is wearing a gorilla
suit.

 -- Matt Mahoney, matmaho...@yahoo.com
Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-24 Thread Mike Tintner
Huh, Matt? What examples of this "holistic scene analysis" are there (or are
you thinking about)?


Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-24 Thread Matt Mahoney
Mike Tintner wrote:
 Huh, Matt? What examples of this holistic scene analysis are there (or are 
you thinking about)?
 
I mean a neural model with increasingly complex features, as opposed to an 
algorithmic 3-D model (like video game graphics in reverse).

Of course David rejects such ideas ( http://practicalai.org/Prize/Default.aspx )
even though the one proven working vision model uses it.

-- Matt Mahoney, matmaho...@yahoo.com

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-24 Thread David Jones
Matt,

Any method must deal with similar, if not the same, ambiguities. You need to
show how neural nets solve this problem, or how they achieve AGI goals while
skipping the problem entirely. Until then, it is not a successful method.

Dave

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-24 Thread Mike Tintner
Matt: 
I mean a neural model with increasingly complex features, as opposed to an 
algorithmic 3-D model (like video game graphics in reverse). Of course David 
rejects such ideas ( http://practicalai.org/Prize/Default.aspx ) even though 
the one proven working vision model uses it.


Which is? And does what? (I'm starting to consider that vision and visual
perception - or perhaps one should say common sense, since no sense in
humans works independently of the others - may well be considerably *more*
complex than language. The evolutionary time required to develop our common
sense perception and conception of the world was vastly greater than that
required to develop language. And we are as a culture merely in our babbling
infancy in beginning to understand how sensory images work and are
processed.)


Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-24 Thread Matt Mahoney
Mike Tintner wrote:
 Which is?
 
The one right behind your eyes.

-- Matt Mahoney, matmaho...@yahoo.com

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-24 Thread David Jones
Check this out!

The title "Space and time, not surface features, guide object persistence"
says it all.

http://pbr.psychonomic-journals.org/content/14/6/1199.full.pdf

Over just the last couple of days I have begun to realize that they are so
right. My earlier idea of using high frame rates is also spot on. The brain
does not use features as much as we think. First we construct a model of the
object, then we probably decide what features to index it with for future
search. If we know that the object occurs at a particular location in space,
then we can learn a great deal about it with very little ambiguity! Of
course, processing images at all is hard, but that's beside the point...
The point is that we can automatically learn about the world using high
frame rates and a simple heuristic for identifying specific objects in a
scene. Because we can reliably identify them, we can learn an extremely
large amount in a very short period of time. We can learn how lighting
affects colors, noise, size, shape, components, attachment relationships,
etc.

So, it is very likely that screenshots are not simpler than real images!
lol. The objects in real images usually don't change as much, as drastically,
or as quickly as the objects in screenshots. That means we can use the simple
heuristics of size, shape, location, and continuity in time to match objects
and learn about them.
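
As a rough sketch of that heuristic (assuming detections have already been
extracted from each frame; the function and names are illustrative, not a
finished design): at a high enough frame rate an object barely moves between
frames, so greedy nearest-neighbor matching on position alone goes a long
way, without relying on surface features.

    import math

    def match_objects(prev, curr, max_step=5.0):
        """Greedy spatiotemporal matching between consecutive frames.

        prev: list of (object_id, x, y) from the previous frame.
        curr: list of (x, y) detections in the current frame.
        At high frame rates, objects move only a few pixels per frame,
        so position plus continuity in time is a strong identity cue.
        """
        matches, used = {}, set()
        for oid, px, py in prev:
            best, best_d = None, max_step
            for j, (cx, cy) in enumerate(curr):
                if j in used:
                    continue
                d = math.hypot(cx - px, cy - py)
                if d <= best_d:
                    best, best_d = j, d
            if best is not None:
                used.add(best)
                matches[oid] = best   # same object persists
        # Unmatched detections in curr are candidates for new objects.
        return matches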

Dave

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-24 Thread David Jones
This is absolutely incredible. The answer was right there in the last
paragraph:

The present experiments suggest that the computation
of object persistence appears to rely so heavily upon spatiotemporal
information that it will not (or at least is unlikely
to) use otherwise available surface feature information,
particularly when there is conflicting spatiotemporal
information. This reveals a striking limitation, given various
theories that visual perception uses whatever shortcuts,
or heuristics, it can to simplify processing, as well as
the theory that perception evolves out of a buildup of the
statistical nature of our environment (e.g., Purves & Lotto,
2003). Instead, it appears that the object file system has
“tunnel vision” and turns a blind eye to surface feature information,
focusing on spatiotemporal information when
computing persistence.

So much for Matt's claim that the brain uses hierarchical features! LOL

Dave

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-22 Thread Abram Demski
Jim,

Why more predictive *and then* simpler?

--Abram

On Thu, Jul 22, 2010 at 11:49 AM, David Jones davidher...@gmail.com wrote:

 An Update

 I think the following gets to the heart of general AI and what it takes to
 achieve it. It also provides us with evidence as to why general AI is so
 difficult. With this new knowledge in mind, I think I will be much more
 capable now of solving the problems and making it work.

 I've come to the conclusion lately that the best hypothesis is better
 because it is more predictive and then simpler than other hypotheses (in
 that order: more predictive... then simpler). But, I am amazed at how
 difficult it is to quantitatively define "more predictive" and "simpler"
 for specific problems. This is why I have sometimes doubted the truth of the
 statement.

 In addition, the observations that the AI gets are not representative of
 all observations! This means that if your measure of predictiveness
 depends on the number of certain observations, it could make mistakes! So,
 the specific observations you are aware of may be unrepresentative of the
 predictiveness of a hypothesis relative to the truth. If you try to
 calculate which hypothesis is more predictive and you don't have the
 critical observations that would give you the right answer, you may get the
 wrong answer! This all depends of course on your method of calculation,
 which is quite elusive to define.

 Visual input from screenshots, for example, can be somewhat malicious.
 Things can move, appear, disappear or occlude each other suddenly. So,
 without sufficient knowledge it is hard to decide whether matches you find
 between such large changes are because it is the same object or a different
 object. This may indicate that bias and preprogrammed experience should be
 introduced to the AI before training. Either that or the training inputs
 should be carefully chosen to avoid malicious input and to make them nice
 for learning.

 This is the correspondence problem that is typical of computer vision and
 has never been properly solved. Such malicious input also makes it difficult
 to learn automatically because the AI doesn't have sufficient experience to
 know which changes or transformations are acceptable and which are not. It
 is immediately bombarded with malicious inputs.

 I've also realized that if a hypothesis is more explanatory, it may be
 better. But quantitatively defining explanatory is also elusive and truly
 depends on the specific problems you are applying it to because it is a
 heuristic. It is not a true measure of correctness. It is not loyal to the
 truth. More explanatory is really a heuristic that helps us find
 hypothesis that are more predictive. The true measure of whether a
 hypothesis is better is simply the most accurate and predictive hypothesis.
 That is the ultimate and true measure of correctness.

 Also, since we can't measure every possible prediction or every last
 prediction (and we certainly can't predict everything), our measure of
 predictiveness can't possibly be right all the time! We have no choice but
 to use a heuristic of some kind.

 So, it's clear to me that the right hypothesis is more predictive and then
 simpler. But, it is also clear that there will never be a single measure of
 this that can be applied to all problems. I hope to eventually find a nice
 model for how to apply it to different problems though. This may be the
 reason that so many people have tried and failed to develop general AI. Yes,
 there is a solution. But there is no silver bullet that can be applied to
 all problems. Some methods are better than others. But I think another major
 reason for the failures is that people think they can predict things without
 sufficient information. By approaching the problem this way, we compound the
 need for heuristics and the errors they produce because we simply don't have
 sufficient information to make a good decision with limited evidence. If
 approached correctly, the right solution would solve many more problems with
 the same efforts than a poor solution would. It would also eliminate some of
 the difficulties we currently face if sufficient data is available to learn
 from.

 In addition to all this theory about better hypotheses, you have to add on
 the need to solve problems in reasonable time. This also compounds the
 difficulty of the problem and the complexity of solutions.

 I am always fascinated by the extraordinary difficulty and complexity of
 this problem. The more I learn about it, the more I appreciate it.

 Dave

-- 
Abram Demski
http://lo-tho.blogspot.com/
http://groups.google.com/group/one-logic

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-22 Thread Mike Tintner
Predicting the old and predictable  [incl in shape and form] is narrow AI. 
Squaresville.
Adapting to the new and unpredictable [incl in shape and form] is AGI. Rock on.


Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-22 Thread David Jones
Because simpler is not better if it is less predictive.
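
The ordering I mean can be written as a lexicographic sort. A minimal sketch
(the two scoring functions are placeholders; defining them is the hard part):

    # Rank hypotheses by predictiveness first; break ties by simplicity.
    # predictiveness() and complexity() are hypothetical placeholders.
    def rank(hypotheses, predictiveness, complexity):
        return sorted(hypotheses,
                      key=lambda h: (-predictiveness(h), complexity(h)))

    # Under this ordering a simpler hypothesis can never beat a more
    # predictive one: simpler is not better if it is less predictive.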


Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-22 Thread Matt Mahoney
David Jones wrote:
 But, I am amazed at how difficult it is to quantitatively define more
predictive and simpler for specific problems.

It isn't hard. To measure predictiveness, you assign a probability to each
possible outcome. If the actual outcome has probability p, you score a penalty
of log(1/p) bits. To measure simplicity, use the compressed size of the code
for your prediction algorithm. Then add the two scores together. That's how it
is done in the Calgary challenge http://www.mailcom.com/challenge/ and in my
own text compression benchmark.
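
In code, that scoring looks roughly like this (a sketch; the compressed size
of the prediction program is represented by a placeholder constant):

    import math

    def penalty_bits(p):
        # Penalty for an actual outcome that was assigned probability p:
        # log2(1/p) bits, the ideal code length for that outcome.
        return math.log2(1.0 / p)

    def total_score(assigned_probs, compressed_code_bits):
        # Per-outcome penalties plus the compressed size of the prediction
        # program itself, all in bits. Lower is better.
        return sum(penalty_bits(p) for p in assigned_probs) + compressed_code_bits

    # Three outcomes predicted with probabilities 0.5, 0.25, and 0.9 by a
    # program whose compressed size is (hypothetically) 1000 bits:
    score = total_score([0.5, 0.25, 0.9], 1000)   # = 1 + 2 + 0.152 + 1000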

 -- Matt Mahoney, matmaho...@yahoo.com

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-22 Thread David Jones
It's certainly not as simple as you claim. First, assigning a probability is
not always possible, nor is it easy. The factors in calculating that
probability are unknown and are not the same for every instance. Since we do
not know what combination of observations we will see, we cannot have a
predefined set of probabilities, nor is it any easier to create a
probability function that generates them for us. That is exactly what I meant
by quantitatively defining the predictiveness... it would be proportional to
the probability.

Second, if you can define a program in a way that is always simpler when it
is smaller, then you can do the same thing without a program. I don't think
it makes any sense to do it this way.

It is not that simple. If it were, we could solve a large portion of AGI
easily.

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-22 Thread Abram Demski
ps-- Sorry for accidentally calling you Jim!

On Thu, Jul 22, 2010 at 1:21 PM, Abram Demski abramdem...@gmail.com wrote:

 Jim,

 Why more predictive *and then* simpler?

 --Abram


 On Thu, Jul 22, 2010 at 11:49 AM, David Jones davidher...@gmail.comwrote:

 An Update

 I think the following gets to the heart of general AI and what it takes to
 achieve it. It also provides us with evidence as to why general AI is so
 difficult. With this new knowledge in mind, I think I will be much more
 capable now of solving the problems and making it work.

 I've come to the conclusion lately that the best hypothesis is the one that
 is more predictive and only then simpler than the other hypotheses (in that
 order: more predictive first, then simpler). But I am amazed at how
 difficult it is to quantitatively define more predictive and simpler for
 specific problems. This is why I have sometimes doubted the truth of the
 statement.

 In addition, the observations that the AI gets are not representative of
 all observations! This means that if your measure of predictiveness
 depends on the number of certain observations, it could make mistakes! So,
 the specific observations you are aware of may be unrepresentative of the
 predictiveness of a hypothesis relative to the truth. If you try to
 calculate which hypothesis is more predictive and you don't have the
 critical observations that would give you the right answer, you may get the
 wrong answer! This all depends of course on your method of calculation,
 which is quite elusive to define.

 Visual input from screenshots, for example, can be somewhat malicious.
 Things can move, appear, disappear or occlude each other suddenly. So,
 without sufficient knowledge it is hard to decide whether matches you find
 between such large changes are because it is the same object or a different
 object. This may indicate that bias and preprogrammed experience should be
 introduced to the AI before training. Either that or the training inputs
 should be carefully chosen to avoid malicious input and to make them nice
 for learning.

 This is the correspondence problem that is typical of computer vision
 and has never been properly solved. Such malicious input also makes it
 difficult to learn automatically because the AI doesn't have sufficient
 experience to know which changes or transformations are acceptable and which
 are not. It is immediately bombarded with malicious inputs.

 I've also realized that if a hypothesis is more explanatory, it may be
 better. But quantitatively defining explanatory is also elusive, and it
 truly depends on the specific problems you are applying it to, because it is
 a heuristic. It is not a true measure of correctness. It is not loyal to the
 truth. More explanatory is really a heuristic that helps us find hypotheses
 that are more predictive. The true measure of whether a hypothesis is better
 is simply whether it is the most accurate and predictive one. That is the
 ultimate and true measure of correctness.

 Also, since we can't measure every possible prediction or every last
 prediction (and we certainly can't predict everything), our measure of
 predictiveness can't possibly be right all the time! We have no choice but
 to use a heuristic of some kind.

 So, it's clear to me that the right hypothesis is first more predictive and
 then simpler. But, it is also clear that there will never be a single measure
 of this that can be applied to all problems. I hope to eventually find a nice
 model for how to apply it to different problems though. This may be the
 reason that so many people have tried and failed to develop general AI. Yes,
 there is a solution. But there is no silver bullet that can be applied to all
 problems. Some methods are better than others. But I think another major
 reason for the failures is that people think they can predict things without
 sufficient information. By approaching the problem this way, we compound the
 need for heuristics and the errors they produce, because we simply don't have
 enough evidence to make a good decision. If approached correctly, the right
 solution would solve many more problems with the same effort than a poor
 solution would. It would also eliminate some of the difficulties we currently
 face, provided sufficient data is available to learn from.

 In addition to all this theory about better hypotheses, you have to add on
 the need to solve problems in reasonable time. This also compounds the
 difficulty of the problem and the complexity of solutions.

 I am always fascinated by the extraordinary difficulty and complexity of
 this problem. The more I learn about it, the more I appreciate it.

 Dave




 --
 Abram Demski
 http://lo-tho.blogspot.com/
 http://groups.google.com/group/one-logic


Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-22 Thread Abram Demski
David,

What are the different ways you are thinking of for measuring the
predictiveness? I can think of a few different possibilities (such as
measuring number incorrect vs measuring fraction incorrect, et cetera) but
I'm wondering which variations you consider significant/troublesome/etc.

--Abram

On Thu, Jul 22, 2010 at 7:12 PM, David Jones davidher...@gmail.com wrote:

 It's certainly not as simple as you claim. First, assigning a probability
 is not always possible, nor is it easy. The factors in calculating that
 probability are unknown and are not the same for every instance. Since we do
 not know what combination of observations we will see, we cannot have a
 predefined set of probabilities, nor is it any easier to create a probability
 function that generates them for us. That is exactly what I meant by
 quantitatively define the predictiveness... it would be proportional to the
 probability.

 Second, if you can define a program in a way that is always simpler when it
 is smaller, then you can do the same thing without a program. I don't think
 it makes any sense to do it this way.

 It is not that simple. If it was, we could solve a large portion of agi
 easily.

 On Thu, Jul 22, 2010 at 3:16 PM, Matt Mahoney matmaho...@yahoo.com
 wrote:

 David Jones wrote:

  But, I am amazed at how difficult it is to quantitatively define more
 predictive and simpler for specific problems.

 It isn't hard. To measure predictiveness, you assign a probability to each
 possible outcome. If the actual outcome has probability p, you score a
 penalty of log(1/p) bits. To measure simplicity, use the compressed size of
 the code for your prediction algorithm. Then add the two scores together.
 That's how it is done in the Calgary challenge
 http://www.mailcom.com/challenge/ and in my own text compression
 benchmark.
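
A minimal sketch of that scoring rule (my own illustration, not code from the
challenge; approximating the compressed size of the code with zlib is an
assumption on my part):

import math, zlib

def prediction_penalty_bits(assigned_probs):
    # Sum log2(1/p) over the probabilities assigned to outcomes that occurred.
    return sum(math.log2(1.0 / p) for p in assigned_probs)

def simplicity_penalty_bits(source_code):
    # Proxy for the compressed size of the prediction algorithm, in bits.
    return 8 * len(zlib.compress(source_code.encode()))

model_src = "def predict(x): return {'same': 0.7, 'different': 0.3}"  # hypothetical
probs_of_actual_outcomes = [0.7] * 10  # the model gave p=0.7 to ten observed outcomes

total = (prediction_penalty_bits(probs_of_actual_outcomes)
         + simplicity_penalty_bits(model_src))
print(total)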



 -- Matt Mahoney, matmaho...@yahoo.com

 *From:* David Jones davidher...@gmail.com
 *To:* agi agi@v2.listbox.com
 *Sent:* Thu, July 22, 2010 3:11:46 PM
 *Subject:* Re: [agi] Re: Huge Progress on the Core of AGI

 Because simpler is not better if it is less predictive.

 On Thu, Jul 22, 2010 at 1:21 PM, Abram Demski abramdem...@gmail.com
 wrote:

 Jim,

 Why more predictive *and then* simpler?

 --Abram

 On Thu, Jul 22, 2010 at 11:49 AM, David Jones davidher...@gmail.com
 wrote:

  An Update

 I think the following gets to the heart of general AI and what it takes to
 achieve it. It also provides us with evidence as to why general AI is so
 difficult. With this new knowledge in mind, I think I will be much more
 capable now of solving the problems and making it work.

 I've come to the conclusion lately that the best hypothesis is the one that
 is more predictive and only then simpler than the other hypotheses (in that
 order: more predictive first, then simpler). But I am amazed at how
 difficult it is to quantitatively define more predictive and simpler for
 specific problems. This is why I have sometimes doubted the truth of the
 statement.

 In addition, the observations that the AI gets are not representative of
 all observations! This means that if your measure of predictiveness
 depends on the number of certain observations, it could make mistakes! So,
 the specific observations you are aware of may be unrepresentative of the
 predictiveness of a hypothesis relative to the truth. If you try to
 calculate which hypothesis is more predictive and you don't have the
 critical observations that would give you the right answer, you may get the
 wrong answer! This all depends of course on your method of calculation,
 which is quite elusive to define.

 Visual input from screenshots, for example, can be somewhat malicious.
 Things can move, appear, disappear or occlude each other suddenly. So,
 without sufficient knowledge it is hard to decide whether matches you find
 between such large changes are because it is the same object or a different
 object. This may indicate that bias and preprogrammed experience should be
 introduced to the AI before training. Either that or the training inputs
 should be carefully chosen to avoid malicious input and to make them nice
 for learning.

 This is the correspondence problem that is typical of computer vision and
 has never been properly solved. Such malicious input also makes it difficult
 to learn automatically because the AI doesn't have sufficient experience to
 know which changes or transformations are acceptable and which are not. It
 is immediately bombarded with malicious inputs.

 I've also realized that if a hypothesis is more explanatory, it may be
 better. But quantitatively defining explanatory is also elusive, and it
 truly depends on the specific problems you are applying it to, because it is
 a heuristic. It is not a true measure of correctness. It is not loyal to the
 truth. More explanatory is really a heuristic that helps us find hypotheses
 that are more predictive. The true measure of whether a hypothesis is better
 is simply whether it is the most

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-13 Thread Abram Demski
PS-- I am not denying that statistics is applied probability theory. :) When
I say they are different, what I mean is that saying I'm going to use
probability theory and I'm going to use statistics tend to indicate very
different approaches. Probability is a set of axioms, whereas statistics is
a set of methods. The probability theory camp tends to be bayesian, whereas
the stats camp tends to be frequentist.

Your complaint that probability theory doesn't try to figure out why it was
wrong in the 30% (or whatever) it misses is a common objection. Probability
theory glosses over important detail, it encourages lazy thinking, etc.
However, this all depends on the space of hypotheses being examined.
Statistical methods will be prone to this objection because they are
essentially narrow-AI methods: they don't *try* to search in the space of
all hypotheses a human might consider. An AGI setup can and should have such
a large hypothesis space. Note that AIXI is typically formulated as using a
space of crisp (non-probabilistic) hypotheses, though probability theory is
used to reason about them. This means no theory it considers will gloss over
detail in this way: every theory completely explains the data. (I use AIXI
as a convenient example, not because I agree with it.)

--Abram

On Mon, Jul 12, 2010 at 2:42 PM, Abram Demski abramdem...@gmail.com wrote:

 David,

 I tend to think of probability theory and statistics as different things.
 I'd agree that statistics is not enough for AGI, but in contrast I think
 probability theory is a pretty good foundation. Bayesianism to me provides a
 sound way of integrating the elegance/utility tradeoff of explanation-based
 reasoning into the basic fabric of the uncertainty calculus. Others advocate
 different sorts of uncertainty than probabilities, but so far what I've seen
 indicates more a lack of ability to apply probability theory than a need for
 a new type of uncertainty. What other methods do you favor for dealing with
 these things?

 --Abram


 On Sun, Jul 11, 2010 at 12:30 PM, David Jones davidher...@gmail.com wrote:

 Thanks Abram,

 I know that probability is one approach. But there are many problems with
 using it in actual implementations. I know a lot of people will be angered
 by that statement and retort with all the successes that they have had using
 probability. But, the truth is that you can solve the problems many ways and
 every way has its pros and cons. I personally believe that probability has
 unacceptable cons if used all by itself. It must only be used when it is the
 best tool for the task.

 I do plan to use some probability within my approach. But only when it
 makes sense to do so. I do not believe in completely statistical solutions
 or completely Bayesian machine learning alone.

 A good example of when I might use it is when a particular hypothesis
 predicts something with 70% accuracy, well it may be better than any other
 hypothesis we can come up with so far. So, we may use that hypothesis. But,
 the 30% unexplained errors should be explained, if at all possible, with the
 resources and algorithms available. This is where my
 method differs from statistical methods. I want to build algorithms that
 resolve the 30% and explain it. For many problems, there are rules and
 knowledge that will solve them effectively. Probability should only be used
 when you cannot find a more accurate solution.
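
A skeletal sketch of that policy (my own framing; the rule objects here are
hypothetical): prefer an exact rule when one applies, and fall back to the
probabilistic guess only when nothing explains the case.

# Illustrative decision policy: explanatory rules first, probability as fallback.
def predict(observation, rules, base_rate=0.7):
    for rule in rules:
        if rule['applies'](observation):
            return rule['prediction'], 1.0   # explained by a learned rule
    return 'most_probable_event', base_rate  # unexplained residue: guess

rules = [
    {'applies': lambda o: o.get('moves_with') == 'table',
     'prediction': 'attached_to_table'},
]

print(predict({'moves_with': 'table'}, rules))  # ('attached_to_table', 1.0)
print(predict({'moves_with': None}, rules))     # ('most_probable_event', 0.7)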

 Basically we should use probability when we don't know the factors
 involved, can't find any rules to explain the phenomena or we don't have the
 time and resources to figure it out. So you must simply guess at the most
 probable event without any rules for figuring out which event is more
 applicable under the current circumstances.

 So, in summary, probability definitely has its place. I just think that
 explanatory reasoning and other more accurate methods should be preferred
 whenever possible.

 Regarding learning the knowledge being the bigger problem, I completely
 agree. That is why I think it is so important to develop machine learning
 that can learn by direct observation of the environment. Without that, it is
 practically impossible to gather the knowledge required for AGI-type
 applications. We can learn this knowledge by analyzing the world
 automatically and generally through video.

 My step by step approach for learning and then applying the knowledge for
 agi is as follows:
 1) Understand and learn about the environment (through Computer Vision for
 now and other sensory perceptions in the future)
 2) learn about your own actions and how they affect the environment
 3) learn about language and how it is associated with or related to the
 environment.
 4) learn goals from language(such as through dedicated inputs).
 5) Goal pursuit
 6) Other Miscellaneous capabilities as needed

 Dave

 On Sat, Jul 10, 2010 at 8:40 PM, Abram Demski abramdem...@gmail.com wrote:

 David,

 Sorry for the slow response.

 I agree 

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-13 Thread Jim Bromer
On Tue, Jul 13, 2010 at 2:29 AM, Abram Demski abramdem...@gmail.com wrote:
[The] complaint that probability theory doesn't try to figure out why it was
wrong in the 30% (or whatever) it misses is a common objection. Probability
theory glosses over important detail, it encourages lazy thinking, etc.
However, this all depends on the space of hypotheses being examined.
Statistical methods will be prone to this objection because they are
essentially narrow-AI methods: they don't *try* to search in the space of
all hypotheses a human might consider. An AGI setup can and should have such
a large hypothesis space.
---
That is the thing.
We cannot search all possible hypotheses because we could not even write all
possible hypotheses down.  This is why hypotheses have to be formed
creatively in response to an analysis of a situation.  In my arrogant
opinion, this is best done through a method that creatively uses discrete
representations.  Of course it can use statistical or probabilistic data in
making those creative hypotheses when there is good data to be used.  But
the best way to do this is through categorization based creativity.  But
this is an imaginative method, one which creates imaginative
explanations (or other co-relations) for observed or conjectured events.
Those imaginative hypotheses then have to be compared to a situation through
some trial and error methods.  Then the tentative conjectures that seem to
withstand initial tests have to be further integrated into other hypotheses,
conjectures and explanations that are related to the subject of the
hypotheses.   This process of conceptual integration, a process which has to
rely on both creative methods and rational methods, is a fundamental part of
the process which does not seem to be clearly understood.  Conceptual
Integration cannot be accomplished by reducing a concept to True or False or
to some number from 0 to 1 and then combined with other concepts that were
also so reduced.  Ideas take on roles when combined with other ideas.
Basically, a new idea has to be fit into a complex of other ideas that are
strongly related to it.

Jim Bromer





On Tue, Jul 13, 2010 at 2:29 AM, Abram Demski abramdem...@gmail.com wrote:

 PS-- I am not denying that statistics is applied probability theory. :)
 When I say they are different, what I mean is that saying I'm going to use
 probability theory and I'm going to use statistics tend to indicate very
 different approaches. Probability is a set of axioms, whereas statistics is
 a set of methods. The probability theory camp tends to be bayesian, whereas
 the stats camp tends to be frequentist.

 Your complaint that probability theory doesn't try to figure out why it was
 wrong in the 30% (or whatever) it misses is a common objection. Probability
 theory glosses over important detail, it encourages lazy thinking, etc.
 However, this all depends on the space of hypotheses being examined.
 Statistical methods will be prone to this objection because they are
 essentially narrow-AI methods: they don't *try* to search in the space of
 all hypotheses a human might consider. An AGI setup can and should have such
 a large hypothesis space. Note that AIXI is typically formulated as using a
 space of crisp (non-probabilistic) hypotheses, though probability theory is
 used to reason about them. This means no theory it considers will gloss over
 detail in this way: every theory completely explains the data. (I use AIXI
 as a convenient example, not because I agree with it.)

 --Abram

 On Mon, Jul 12, 2010 at 2:42 PM, Abram Demski abramdem...@gmail.com wrote:

 David,

 I tend to think of probability theory and statistics as different things.
 I'd agree that statistics is not enough for AGI, but in contrast I think
 probability theory is a pretty good foundation. Bayesianism to me provides a
 sound way of integrating the elegance/utility tradeoff of explanation-based
 reasoning into the basic fabric of the uncertainty calculus. Others advocate
 different sorts of uncertainty than probabilities, but so far what I've seen
 indicates more a lack of ability to apply probability theory than a need for
 a new type of uncertainty. What other methods do you favor for dealing with
 these things?

 --Abram


 On Sun, Jul 11, 2010 at 12:30 PM, David Jones davidher...@gmail.com wrote:

 Thanks Abram,

 I know that probability is one approach. But there are many problems with
 using it in actual implementations. I know a lot of people will be angered
 by that statement and retort with all the successes that they have had using
 probability. But, the truth is that you can solve the problems many ways and
 every way has its pros and cons. I personally believe that probability has
 unacceptable cons if used all by itself. It must only be used when it is the
 best tool for the task.

 I do plan to use some probability within my approach. But only when it
 makes sense to do so. I do not believe in completely 

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-13 Thread David Jones
Abram,

Thanks for the clarification Abram. I don't have a single way to deal with
uncertainty. I try not to decide on a method ahead of time because what I
really want to do is analyze the problems and find a solution. But, at the
same time. I have looked at the probabilistic approaches and they don't seem
to be sufficient to solve the problem as they are now. So, my inclination is
to use methods that don't gloss over important details. For me, the most
important way of dealing with uncertainty is through explanatory-type
reasoning. But, explanatory reasoning has not been well defined yet. So, the
implementation is not yet clear. That's where I am now.

I've begun to approach problems as follows. I try to break the problem down
and answer the following questions:
1) How do we come up with or construct possible hypotheses.
2) How do we compare hypotheses to determine which is better.
3) How do we lower the uncertainty of hypotheses.
4) How do we determine the likelihood or strength of a single hypothesis all
by itself. Is it sufficient on its own?

With those questions in mind, the solution seems to be to break possible
hypotheses down into pieces that are generally applicable. For example, in
image analysis, a particular type of hypothesis might be related to 1)
motion or 2) attachment relationships or 3) change or movement behavior of
an object, etc.

By breaking the possible hypotheses into very general pieces, you can apply
them to just about any problem. With that as a tool, you can then develop
general methods for resolving uncertainty of such hypotheses using
explanatory scoring, consistency, and even statistical analysis.
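
As a minimal sketch of that decomposition (the piece names and the additive
scoring are my invented illustration, not a settled design): represent a
hypothesis as a bundle of generic pieces, each scored on its own, so the same
comparison machinery applies across problems.

# Illustrative hypothesis-as-pieces structure with per-piece scoring.
from dataclasses import dataclass, field

@dataclass
class Piece:
    kind: str           # e.g. 'motion', 'attachment', 'behavior'
    score: float = 0.0  # from explanatory scoring, consistency, or statistics

@dataclass
class Hypothesis:
    pieces: list = field(default_factory=list)

    def total_score(self):
        # Additive combination is just one simple choice.
        return sum(p.score for p in self.pieces)

h1 = Hypothesis([Piece('motion', 0.9), Piece('attachment', 0.6)])
h2 = Hypothesis([Piece('motion', 0.4), Piece('behavior', 0.8)])
print(max((h1, h2), key=Hypothesis.total_score).total_score())  # 1.5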

Does that make sense to you?

Dave

On Tue, Jul 13, 2010 at 2:29 AM, Abram Demski abramdem...@gmail.com wrote:

 PS-- I am not denying that statistics is applied probability theory. :)
 When I say they are different, what I mean is that saying I'm going to use
 probability theory and I'm going to use statistics tend to indicate very
 different approaches. Probability is a set of axioms, whereas statistics is
 a set of methods. The probability theory camp tends to be bayesian, whereas
 the stats camp tends to be frequentist.

 Your complaint that probability theory doesn't try to figure out why it was
 wrong in the 30% (or whatever) it misses is a common objection. Probability
 theory glosses over important detail, it encourages lazy thinking, etc.
 However, this all depends on the space of hypotheses being examined.
 Statistical methods will be prone to this objection because they are
 essentially narrow-AI methods: they don't *try* to search in the space of
 all hypotheses a human might consider. An AGI setup can and should have such
 a large hypothesis space. Note that AIXI is typically formulated as using a
 space of crisp (non-probabilistic) hypotheses, though probability theory is
 used to reason about them. This means no theory it considers will gloss over
 detail in this way: every theory completely explains the data. (I use AIXI
 as a convenient example, not because I agree with it.)

 --Abram






Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-13 Thread Mike Tintner
You seem to be reaching for something important here, but it isn't at all clear 
what you mean.

I would say that any creative activity (incl. pure problemsolving) begins from 
a *conceptual paradigm* - a v. rough outline - of the form of that activity and 
the form of its end-product or -procedure.  As distinct from rational 
activities where a formula (and algorithm) define the form of the product (and 
activity) with complete precision.

You have a conceptual paradigm of writing a post or shopping for groceries 
or having a conversation. You couldn't possibly have a formula or algorithm 
completely defining every step - every word and sentence, every food, every 
topic  - you may have or want to take.

And programs as we know them, don't and can't handle *concepts* -  despite the 
misnomers of conceptual graphs/spaces etc wh are not concepts at all.  They 
can't for example handle writing or shopping because these can only be 
expressed as flexible outlines/schemas as per ideograms.

What do you mean?





From: Jim Bromer 
Sent: Tuesday, July 13, 2010 2:50 PM
To: agi 
Subject: Re: [agi] Re: Huge Progress on the Core of AGI


On Tue, Jul 13, 2010 at 2:29 AM, Abram Demski abramdem...@gmail.com wrote:
[The] complaint that probability theory doesn't try to figure out why it was 
wrong in the 30% (or whatever) it misses is a common objection. Probability 
theory glosses over important detail, it encourages lazy thinking, etc. 
However, this all depends on the space of hypotheses being examined. 
Statistical methods will be prone to this objection because they are 
essentially narrow-AI methods: they don't *try* to search in the space of all 
hypotheses a human might consider. An AGI setup can and should have such a 
large hypothesis space.
---
That is the thing.
We cannot search all possible hypotheses because we could not even write all 
possible hypotheses down.  This is why hypotheses have to be formed creatively 
in response to an analysis of a situation.  In my arrogant opinion, this is 
best done through a method that creatively uses discrete representations.  Of 
course it can use statistical or probabilistic data in making those creative 
hypotheses when there is good data to be used.  But the best way to do this is 
through categorization based creativity.  But this is an imaginative method, 
one which creates imaginative explanations (or other co-relations) for observed 
or conjectured events.  Those imaginative hypotheses then have to be compared 
to a situation through some trial and error methods.  Then the tentative 
conjectures that seem to withstand initial tests have to be further integrated 
into other hypotheses, conjectures and explanations that are related to the 
subject of the hypotheses.   This process of conceptual integration, a process 
which has to rely on both creative methods and rational methods, is a 
fundamental part of the process which does not seem to be clearly understood.  
Conceptual Integration cannot be accomplished by reducing a concept to True or 
False or to some number from 0 to 1 and then combined with other concepts that 
were also so reduced.  Ideas take on roles when combined with other ideas.  
Basically, a new idea has to be fit into a complex of other ideas that are 
strongly related to it.

Jim Bromer




 
On Tue, Jul 13, 2010 at 2:29 AM, Abram Demski abramdem...@gmail.com wrote:

  PS-- I am not denying that statistics is applied probability theory. :) When 
I say they are different, what I mean is that saying I'm going to use 
probability theory and I'm going to use statistics tend to indicate very 
different approaches. Probability is a set of axioms, whereas statistics is a 
set of methods. The probability theory camp tends to be bayesian, whereas the 
stats camp tends to be frequentist.

  Your complaint that probability theory doesn't try to figure out why it was 
wrong in the 30% (or whatever) it misses is a common objection. Probability 
theory glosses over important detail, it encourages lazy thinking, etc. 
However, this all depends on the space of hypotheses being examined. 
Statistical methods will be prone to this objection because they are 
essentially narrow-AI methods: they don't *try* to search in the space of all 
hypotheses a human might consider. An AGI setup can and should have such a 
large hypothesis space. Note that AIXI is typically formulated as using a space 
of crisp (non-probabilistic) hypotheses, though probability theory is used to 
reason about them. This means no theory it considers will gloss over detail in 
this way: every theory completely explains the data. (I use AIXI as a 
convenient example, not because I agree with it.)

  --Abram


  On Mon, Jul 12, 2010 at 2:42 PM, Abram Demski abramdem...@gmail.com wrote:

David,

I tend to think of probability theory and statistics as different things. 
I'd agree that statistics is not enough for AGI, but in contrast I think 
probability

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-13 Thread Jim Bromer
On Tue, Jul 13, 2010 at 10:07 AM, Mike Tintner tint...@blueyonder.co.uk wrote:


 And programs as we know them, don't and can't handle *concepts* -  despite
 the misnomers of conceptual graphs/spaces etc wh are not concepts at all.
 They can't for example handle writing or shopping because these can only
 be expressed as flexible outlines/schemas as per ideograms.


I disagree with this, and so this is a proper focus for our disagreement.
Although there are other aspects of the problem that we probably disagree
on, this is such a fundamental issue, that nothing can get past it.  Either
programs can deal with flexible outlines/schema or they can't.  If they
can't then AGI is probably impossible.  If they can, then AGI is probably
possible.

I think that we both agree that creativity and imagination is absolutely
necessary aspects of intelligence.

Jim Bromer





Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-13 Thread Jim Bromer
I meant,

I think that we both agree that creativity and imagination are absolutely
necessary aspects of intelligence.

of course!


On Tue, Jul 13, 2010 at 12:46 PM, Jim Bromer jimbro...@gmail.com wrote:

  On Tue, Jul 13, 2010 at 10:07 AM, Mike Tintner 
 tint...@blueyonder.co.uk wrote:


 And programs as we know them, don't and can't handle *concepts* -  despite
 the misnomers of conceptual graphs/spaces etc wh are not concepts at all.
 They can't for example handle writing or shopping because these can only
 be expressed as flexible outlines/schemas as per ideograms.


 I disagree with this, and so this is a proper focus for our disagreement.
 Although there are other aspects of the problem that we probably disagree
 on, this is such a fundamental issue, that nothing can get past it.  Either
 programs can deal with flexible outlines/schema or they can't.  If they
 can't then AGI is probably impossible.  If they can, then AGI is probably
 possible.

 I think that we both agree that creativity and imagination is absolutely
 necessary aspects of intelligence.

 Jim Bromer










Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-13 Thread Mike Tintner
The first thing is to acknowledge that programs *don't* handle concepts - if 
you think they do, you must give examples.

The reasons they can't, as presently conceived, are 

a) concepts encase a more or less *infinite diversity of forms* (even if only 
applying at first to a species of object)  -  *chair* for example as I've 
demonstrated embraces a vast open-ended diversity of radically different chair 
forms; higher order concepts like  furniture embrace ... well, it's hard to 
think even of the parameters, let alone the diversity of forms, here.

b) concepts are *polydomain*- not just multi- but open-endedly extensible in 
their domains; chair for example, can also refer to a person, skin in French, 
two humans forming a chair to carry s.o., a prize, etc.

Basically concepts have a freeform realm or sphere of reference, and you can't 
have a setform, preprogrammed approach to defining that realm. 

There's no reason however why you can't mechanically and computationally begin 
to instantiate the kind of freeform approach I'm proposing. The most important 
obstacle is the setform mindset of AGI-ers - epitomised by Dave looking at 
squares, moving along set lines - setform objects in setform motion -  when it 
would be more appropriate to look at something like snakes.- freeform objects 
in freeform motion.

Concepts also - altho this is a huge subject - are *the* language of the 
general programs (as distinct from specialist programs, wh. is all we have 
right now)  that must inform an AGI. Anyone proposing a grandscale AGI project 
like Ben's (wh. I def. wouldn't recommend) must crack the problem of 
conceptualisation more or less from the beginning. I'm not aware of anyone who 
has any remotely viable proposals here, are you?


From: Jim Bromer 
Sent: Tuesday, July 13, 2010 5:46 PM
To: agi 
Subject: Re: [agi] Re: Huge Progress on the Core of AGI


On Tue, Jul 13, 2010 at 10:07 AM, Mike Tintner tint...@blueyonder.co.uk 
wrote: 

  And programs as we know them, don't and can't handle *concepts* -  despite 
the misnomers of conceptual graphs/spaces etc wh are not concepts at all.  
They can't for example handle writing or shopping because these can only be 
expressed as flexible outlines/schemas as per ideograms.

I disagree with this, and so this is a proper focus for our disagreement.
Although there are other aspects of the problem that we probably disagree on, 
this is such a fundamental issue, that nothing can get past it.  Either 
programs can deal with flexible outlines/schema or they can't.  If they can't 
then AGI is probably impossible.  If they can, then AGI is probably possible.

I think that we both agree that creativity and imagination is absolutely 
necessary aspects of intelligence.

Jim Bromer









Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-13 Thread David Jones
Mike, you are so full of it. There is a big difference between *can* and
*don't*. You have no proof that programs can't handle anything you say that
can't.

On Tue, Jul 13, 2010 at 2:36 PM, Mike Tintner tint...@blueyonder.co.uk wrote:

  The first thing is to acknowledge that programs *don't* handle concepts -
 if you think they do, you must give examples.

 The reasons they can't, as presently conceived, are

 a) concepts encase a more or less *infinite diversity of forms* (even
 if only applying at first to a species of object)  -  *chair* for example
 as I've demonstrated embraces a vast open-ended diversity of radically
 different chair forms; higher order concepts like  furniture embrace ...
 well, it's hard to think even of the parameters, let alone the diversity of
 forms, here.

 b) concepts are *polydomain*- not just multi- but open-endedly extensible
 in their domains; chair for example, can also refer to a person, skin in
 French, two humans forming a chair to carry s.o., a prize, etc.

 Basically concepts have a freeform realm or sphere of reference, and you
 can't have a setform, preprogrammed approach to defining that realm.

 There's no reason however why you can't mechanically and computationally
 begin to instantiate the kind of freeform approach I'm proposing. The most
 important obstacle is the setform mindset of AGI-ers - epitomised by Dave
 looking at squares, moving along set lines - setform objects in setform
 motion -  when it would be more appropriate to look at something like
 snakes.- freeform objects in freeform motion.

 Concepts also - altho this is a huge subject - are *the* language of the
 general programs (as distinct from specialist programs, wh. is all we
 have right now)  that must inform an AGI. Anyone proposing a grandscale AGI
 project like Ben's (wh. I def. wouldn't recommend) must crack the problem of
 conceptualisation more or less from the beginning. I'm not aware of anyone
 who has any remotely viable proposals here, are you?

  *From:* Jim Bromer jimbro...@gmail.com
 *Sent:* Tuesday, July 13, 2010 5:46 PM
 *To:* agi agi@v2.listbox.com
 *Subject:* Re: [agi] Re: Huge Progress on the Core of AGI

 On Tue, Jul 13, 2010 at 10:07 AM, Mike Tintner 
 tint...@blueyonder.co.uk wrote:


 And programs as we know them, don't and can't handle *concepts* -  despite
 the misnomers of conceptual graphs/spaces etc wh are not concepts at all.
 They can't for example handle writing or shopping because these can only
 be expressed as flexible outlines/schemas as per ideograms.


 I disagree with this, and so this is a proper focus for our disagreement.
 Although there are other aspects of the problem that we probably disagree
 on, this is such a fundamental issue, that nothing can get past it.  Either
 programs can deal with flexible outlines/schema or they can't.  If they
 can't then AGI is probably impossible.  If they can, then AGI is probably
 possible.

 I think that we both agree that creativity and imagination is absolutely
 necessary aspects of intelligence.

 Jim Bromer










Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-13 Thread David Jones
Mike,

see below.

On Tue, Jul 13, 2010 at 2:36 PM, Mike Tintner tint...@blueyonder.co.uk wrote:

  The first thing is to acknowledge that programs *don't* handle concepts -
 if you think they do, you must give examples.

 The reasons they can't, as presently conceived, are

 a) concepts encase a more or less *infinite diversity of forms* (even
 if only applying at first to a species of object)  -  *chair* for example
 as I've demonstrated embraces a vast open-ended diversity of radically
 different chair forms; higher order concepts like  furniture embrace ...
 well, it's hard to think even of the parameters, let alone the diversity of
 forms, here.


invoking infinity is an insufficient argument to say that a program can't
recognize an infinite number of forms.

In fact, I can prove it. Let's say that all numbers are made of the digits
0,1,2,3...9. If you can recognize just those ten digits, you can read
infinitely large numbers.
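
A tiny sketch of that point (mine, not Dave's code): a recognizer for ten
digit symbols composes into a reader for arbitrarily large numbers.

# Illustrative: finitely many learned primitives cover infinitely many inputs.
DIGITS = {str(d): d for d in range(10)}  # the only 'forms' ever learned

def read_number(symbols):
    value = 0
    for s in symbols:
        value = value * 10 + DIGITS[s]  # composition handles unbounded magnitude
    return value

print(read_number('90210'))
print(read_number('123456789012345678901234567890'))  # never 'seen', still read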

Another example, you can create an infinite number of very diverse shapes
and forms out of clay. But, I can represent every last one of them using
simple mesh models. The mesh models are made of a very small number of
concepts: lines, points, distance constraints, etc. So, an infinite number
of diverse concepts or forms can be modeled using a very small number of
concepts.

In conclusion, you have no proof at all that programs can't handle these
things. You just THINK they can't. But, I for one, know you're dead wrong.



 b) concepts are *polydomain*- not just multi- but open-endedly extensible
 in their domains; chair for example, can also refer to a person, skin in
 French, two humans forming a chair to carry s.o., a prize, etc.


A chair is defined as anything you can sit on. Anything you can sit on is
defined by a certain type of form that you can actually learn inductively.
It is not impossible to teach a computer to recognize things that could be
sat on, or even things that seem like they have the form of something that
might be sat on. You cannot legitimately claim that a computer can never
learn this. You see, very diverse concepts can be represented by a small
number of other concepts such as time, space, 3D form, etc. Your claim is
completely baseless.



 Basically concepts have a freeform realm or sphere of reference, and you
 can't have a setform, preprogrammed approach to defining that realm.


you can if it covers base concepts which can represent larger concepts.


 There's no reason however why you can't mechanically and computationally
 begin to instantiate the kind of freeform approach I'm proposing. The most
 important obstacle is the setform mindset of AGI-ers - epitomised by Dave
 looking at squares, moving along set lines - setform objects in setform
 motion -  when it would be more appropriate to look at something like
 snakes.- freeform objects in freeform motion.


Squares can move in an infinite number of ways. It is an experiment, an
exercise, to learn how AGI deals with uncertainty, even if the uncertainty
is very limited.

Clearly you have no imagination to understand why doing such experiments
might be useful. You think moving squares is simple just because they are
squares. But, you fail to realize that uncertainty can be generated out of
even very simple systems. And so far you have never stated how you would
deal with such uncertainty.





Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-13 Thread Jim Bromer
On Tue, Jul 13, 2010 at 2:36 PM, Mike Tintner tint...@blueyonder.co.uk wrote:

  The first thing is to acknowledge that programs *don't* handle concepts -
 if you think they do, you must give examples.

 The reasons they can't, as presently conceived, are

 a) concepts encase a more or less *infinite diversity of forms* (even
 if only applying at first to a species of object)  -  *chair* for example
 as I've demonstrated embraces a vast open-ended diversity of radically
 different chair forms; higher order concepts like  furniture embrace ...
 well, it's hard to think even of the parameters, let alone the diversity of
 forms, here.

 b) concepts are *polydomain*- not just multi- but open-endedly extensible
 in their domains; chair for example, can also refer to a person, skin in
 French, two humans forming a chair to carry s.o., a prize, etc.

 Basically concepts have a freeform realm or sphere of reference, and you
 can't have a setform, preprogrammed approach to defining that realm.

 There's no reason however why you can't mechanically and computationally
 begin to instantiate the kind of freeform approach I'm proposing.



So here you are saying that programs don't handle concepts but they could
begin to instantiate the kind of freeform approach that you are proposing.
Are you sure you are not saying that programs can't handle concepts unless
we do exactly what you are suggesting we should do? Because a lot of
us say that.

Jim Bromer





Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-13 Thread David Jones
Thanks Abram, I'll read up on it when I get a chance.


On Tue, Jul 13, 2010 at 12:03 PM, Abram Demski abramdem...@gmail.com wrote:

 David,

 Yes, this makes sense to me.

 To go back to your original query, I still think you will find a rich
 community relevant to your work if you look into the MDL literature (which
 additionally does not rely on probability theory, though as I said I'd say
 it's equivalent).
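
 For readers unfamiliar with MDL, a minimal sketch of the principle being
 referenced (my illustration, with made-up numbers): choose the hypothesis
 minimizing model bits plus data-given-model bits, which mirrors maximizing a
 Bayesian posterior under the prior 2^-L(H).

import math

# Illustrative MDL selection: total description length L(H) + L(D|H) in bits.
hypotheses = {
    # name: (model_bits, probability the model assigns to the observed data)
    'simple':  (10, 0.01),
    'complex': (100, 0.50),
}

def mdl_bits(model_bits, p_data):
    return model_bits + math.log2(1.0 / p_data)

best = min(hypotheses, key=lambda h: mdl_bits(*hypotheses[h]))
print(best)  # 'simple': 10 + 6.64 bits beats 100 + 1 bits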

 Perhaps this book might be helpful:

 http://www.amazon.com/Description-Principle-Adaptive-Computation-Learning/dp/0262072815/ref=sr_1_1?ie=UTF8s=booksqid=1279036776sr=8-1

 It includes a (short-ish?) section comparing the pros/cons of MDL and
 Bayesianism, and examining some of the mathematical linkings between them--
 with the aim of showing that MDL is a broader principle. I disagree there,
 of course. :)

 --Abram

 On Tue, Jul 13, 2010 at 10:01 AM, David Jones davidher...@gmail.com wrote:

 Abram,

 Thanks for the clarification Abram. I don't have a single way to deal with
 uncertainty. I try not to decide on a method ahead of time because what I
 really want to do is analyze the problems and find a solution. But, at the
 same time. I have looked at the probabilistic approaches and they don't seem
 to be sufficient to solve the problem as they are now. So, my inclination is
 to use methods that don't gloss over important details. For me, the most
 important way of dealing with uncertainty is through explanatory-type
 reasoning. But, explanatory reasoning has not been well defined yet. So, the
 implementation is not yet clear. That's where I am now.

 I've begun to approach problems as follows. I try to break the problem
 down and answer the following questions:
 1) How do we come up with or construct possible hypotheses.
 2) How do we compare hypotheses to determine which is better.
 3) How do we lower the uncertainty of hypotheses.
 4) How do we determine the likelihood or strength of a single hypothesis
 all by itself. Is it sufficient on its own?

 With those questions in mind, the solution seems to be to break possible
 hypotheses down into pieces that are generally applicable. For example, in
 image analysis, a particular type of hypothesis might be related to 1)
 motion or 2) attachment relationships or 3) change or movement behavior of
 an object, etc.

 By breaking the possible hypotheses into very general pieces, you can
 apply them to just about any problem. With that as a tool, you can then
 develop general methods for resolving uncertainty of such hypotheses using
 explanatory scoring, consistency, and even statistical analysis.

 Does that make sense to you?

 Dave


 On Tue, Jul 13, 2010 at 2:29 AM, Abram Demski abramdem...@gmail.com wrote:

 PS-- I am not denying that statistics is applied probability theory. :)
 When I say they are different, what I mean is that saying I'm going to use
 probability theory and I'm going to use statistics tend to indicate very
 different approaches. Probability is a set of axioms, whereas statistics is
 a set of methods. The probability theory camp tends to be bayesian, whereas
 the stats camp tends to be frequentist.

 Your complaint that probability theory doesn't try to figure out why it
 was wrong in the 30% (or whatever) it misses is a common objection.
 Probability theory glosses over important detail, it encourages lazy
 thinking, etc. However, this all depends on the space of hypotheses being
 examined. Statistical methods will be prone to this objection because they
 are essentially narrow-AI methods: they don't *try* to search in the space
 of all hypotheses a human might consider. An AGI setup can and should have
 such a large hypothesis space. Note that AIXI is typically formulated as
 using a space of crisp (non-probabilistic) hypotheses, though probability
 theory is used to reason about them. This means no theory it considers will
 gloss over detail in this way: every theory completely explains the data. (I
 use AIXI as a convenient example, not because I agree with it.)

 --Abram






 --
 Abram Demski
 http://lo-tho.blogspot.com/
 http://groups.google.com/group/one-logic






Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-13 Thread Mike Tintner
**One of the most basic rationales of rationality is let's stop everyone 
farting around making their own versions of products with their own 
differently specified actions and objects; let's specify/standardise the 
actions and objects that everyone must use. Let's start making standard 
specification cherry cakes with standard ingredients, and standard 
mathematical sums with standard numbers and operations, and standard logical 
variables with standard meanings [and cut out any kind of et cetera]  **  

(And for much the same reason programs can't - aren't meant to - handle 
concepts. Every concept , like chair has to refer to a general class of 
objects embracing et ceteras - new, unspecified, yet-to-be-invented kinds of 
objects  and ones that you simply haven't heard of  yet, as well as specified, 
known kinds  of object . Concepts are wonderful cognitive tools for embracing 
unspecified objects. Concepts, for example,  like things, objects, 
actions, do something -  anything all sorts of things - everything you 
can possibly think of  even  write totally new kinds of programs - 
anti-programs - et cetera -  such concepts endow humans with massive 
creative freedom and scope of reference.

You along with the whole of AI/AGI are effectively claiming that there is or 
can be a formula/program for dealing with the unknown - i.e. unknown kinds of 
objects. It's patent absurdity - and counter to the whole spirit of logic and 
rationality -  in fact lunacy. You'll wonder in years to come how so smart 
people could be so dumb.   Could think they're producing programs that can make 
anything - can make cars or cakes - any car or cake  - when the rest of the 
world and his uncle can see that they're only producing very specific brands of 
car and cake (with very specific objects/parts).  VW Beetles not cars let 
alone vehicles let alone forms of transportation let alone means of 
travel let alone universal programs. . 

I'm full of it? AI/AGI is full of the most amazing hype about its generality 
and creativity wh. you have clearly swallowed whole . Programs are simply 
specialist procedures for producing specialist products and procedures with 
specified kinds of actions and objects - they cannot deal with unspecified 
kinds of actions and objects, period. You won't produce any actual examples to 
the contrary.

  


From: David Jones 
Sent: Tuesday, July 13, 2010 8:00 PM
To: agi 
Subject: Re: [agi] Re: Huge Progress on the Core of AGI


Correction:

Mike, you are so full of it. There is a big difference between *can* and 
*don't*. You have no proof that programs can't handle anything you say [they] 
can't.


On Tue, Jul 13, 2010 at 2:59 PM, David Jones davidher...@gmail.com wrote:

  Mike, you are so full of it. There is a big difference between *can* and 
*don't*. You have no proof that programs can't handle anything you say that 
can't. 



  On Tue, Jul 13, 2010 at 2:36 PM, Mike Tintner tint...@blueyonder.co.uk 
wrote:

The first thing is to acknowledge that programs *don't* handle concepts - 
if you think they do, you must give examples.

The reasons they can't, as presently conceived, are 

a) concepts encase a more or less *infinite diversity of forms* (even if 
only applying at first to a species of object)  -  *chair* for example as 
I've demonstrated embraces a vast open-ended diversity of radically different 
chair forms; higher order concepts like  furniture embrace ... well, it's 
hard to think even of the parameters, let alone the diversity of forms, here.

b) concepts are *polydomain*- not just multi- but open-endedly extensible 
in their domains; chair for example, can also refer to a person, skin in 
French, two humans forming a chair to carry s.o., a prize, etc.

Basically concepts have a freeform realm or sphere of reference, and you 
can't have a setform, preprogrammed approach to defining that realm. 

There's no reason however why you can't mechanically and computationally 
begin to instantiate the kind of freeform approach I'm proposing. The most 
important obstacle is the setform mindset of AGI-ers - epitomised by Dave 
looking at squares, moving along set lines - setform objects in setform motion 
-  when it would be more appropriate to look at something like snakes.- 
freeform objects in freeform motion.

Concepts also - altho this is a huge subject - are *the* language of the 
general programs (as distinct from specialist programs, wh. is all we have 
right now)  that must inform an AGI. Anyone proposing a grandscale AGI project 
like Ben's (wh. I def. wouldn't recommend) must crack the problem of 
conceptualisation more or less from the beginning. I'm not aware of anyone who 
has any remotely viable proposals here, are you?


From: Jim Bromer 
Sent: Tuesday, July 13, 2010 5:46 PM
To: agi 
Subject: Re: [agi] Re: Huge Progress on the Core of AGI


On Tue, Jul 13, 2010 at 10:07 AM, Mike Tintner tint...@blueyonder.co.uk 
wrote: 

  And programs as we know them, don't and can't handle *concepts* -  
despite the misnomers of conceptual graphs/spaces etc wh are not concepts at 
all.  They can't for example handle writing or shopping because these can 
only be expressed as flexible outlines/schemas

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-13 Thread David Jones
, find one of
 them that works with *unspecified kinds of actions and objects.*  (Or you
 can always try and explain how  formulae that are clearly designed to be
 setform can somehow simultaneously be freeform and embrace et cetera ).

 There are by the same token no branches of logic and maths that work with
 *unspecified kinds of actions and objects.*   (Mathematicians who invent new
 formulae have to work with and develop new kinds of objects - but normal
 maths can't help them do this).

 The whole of rationality - incl. all rational technology - only works with
 specified kinds of actions and objects.

 **One of the most basic rationales  of rationality is let's stop everyone
 farting around making their own versions of products with their own
 differently specified actions and objects; let's  specify/standardise  the
 actions and objects that everyone must use. Let's start making standard
 specification cherry cakes with standard ingredients, and standard
 mathematical sums with standard numbers and operations, and standard logical
 variables with standard meanings [and cut out any kind of et cetera]  **

 (And for much the same reason programs can't - aren't meant to - handle
 concepts. Every concept , like chair has to refer to a general class of
 objects embracing et ceteras - new, unspecified, yet-to-be-invented kinds of
 objects  and ones that you simply haven't heard of  yet, as well as
 specified, known kinds  of object . Concepts are wonderful cognitive tools
 for embracing unspecified objects. Concepts, for example,  like things,
 objects, actions, do something -  anything all sorts of things -
 everything you can possibly think of  even  write totally new kinds of
 programs - anti-programs - et cetera -  such concepts endow humans with
 massive creative freedom and scope of reference.

 You along with the whole of AI/AGI are effectively claiming that there is
 or can be a formula/program for dealing with the unknown - i.e. unknown
 kinds of objects. It's patent absurdity - and counter to the whole spirit
 of logic and rationality -  in fact lunacy. You'll wonder in years to come
 how so smart people could be so dumb.   Could think they're producing
 programs that can make anything - can make cars or cakes - any car or
 cake  - when the rest of the world and his uncle can see that they're only
 producing very specific brands of car and cake (with very specific
 objects/parts).  VW Beetles not cars let alone vehicles let alone forms
 of transportation let alone means of travel let alone universal
 programs. .

 I'm full of it? AI/AGI is full of the most amazing hype about its
 generality and creativity wh. you have clearly swallowed whole .
 Programs are simply specialist procedures for producing specialist products
 and procedures with specified kinds of actions and objects - they cannot
 deal with unspecified kinds of actions and objects, period. You won't
 produce any actual examples to the contrary.



  *From:* David Jones davidher...@gmail.com
 *Sent:* Tuesday, July 13, 2010 8:00 PM
 *To:* agi agi@v2.listbox.com
 *Subject:* Re: [agi] Re: Huge Progress on the Core of AGI

 Correction:

 Mike, you are so full of it. There is a big difference between *can* and
 *don't*. You have no proof that programs can't handle anything you say
 [they] can't.

 On Tue, Jul 13, 2010 at 2:59 PM, David Jones davidher...@gmail.com wrote:

 Mike, you are so full of it. There is a big difference between *can* and
 *don't*. You have no proof that programs can't handle anything you say that
 can't.


 On Tue, Jul 13, 2010 at 2:36 PM, Mike Tintner 
 tint...@blueyonder.co.uk wrote:

  The first thing is to acknowledge that programs *don't* handle concepts
 - if you think they do, you must give examples.

 The reasons they can't, as presently conceived, are

 a) concepts encase a more or less *infinite diversity of forms* (even
 if only applying at first to a species of object) - *chair* for example,
 as I've demonstrated, embraces a vast open-ended diversity of radically
 different chair forms; higher-order concepts like furniture embrace ...
 well, it's hard to think even of the parameters, let alone the diversity of
 forms, here.

 b) concepts are *polydomain* - not just multi- but open-endedly extensible
 in their domains; chair, for example, can also refer to a person, flesh in
 French, two humans forming a chair to carry s.o., a prize, etc.

 Basically concepts have a freeform realm or sphere of reference, and you
 can't have a setform, preprogrammed approach to defining that realm.

 There's no reason however why you can't mechanically and computationally
 begin to instantiate the kind of freeform approach I'm proposing. The most
 important obstacle is the setform mindset of AGI-ers - epitomised by Dave
 looking at squares moving along set lines - setform objects in setform
 motion - when it would be more appropriate to look at something like
 snakes - freeform objects in freeform motion.

 Concepts also

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-13 Thread Mike Tintner
Dave: The goal of the formula is to scan any unknown object 

How does the program define and therefore recognize object? 

(And why then are you dealing with just squares, if it can deal with this 
apparently vast and unlimited range of objects?) 

If you go into detail, you'll find no program can deal with or define object. 
Jeez, none can recognize a chair - but now apparently they can recognize 
objects. 

What exactly does the program do? Your description is confusing. What forms 
are input and output? Specific examples. If I put in a drawing of overlaid 
circles, or a cartoon face, or a Jackson Pollock, or a photo of any scene, will 
this program give me 3-D versions?

Here's a bet - you're giving me yet more hype.




From: David Jones 
Sent: Wednesday, July 14, 2010 1:32 AM
To: agi 
Subject: Re: [agi] Re: Huge Progress on the Core of AGI


I'm not even going to read your whole email. 

I'll give you a great example of a formula handling unknown objects. The goal 
of the formula is to scan any unknown object and produce a 3D model of it using 
laser scanning. The objects are unknown, but that doesn't mean you can't handle 
unknown inputs. They all have things in common. Objects all have surfaces (at 
least the vast majority). So, whatever methods you can apply to analyze object 
surfaces will work for the vast majority of objects. So, you *CAN* handle 
unknown objects. The same type of solution can be applied to many other 
problems, including AGI. The complete properties of the object or concept may 
be unknown, but the components that can be used to describe it are usually 
known.
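
To make the point concrete, here is a minimal sketch (an illustration of the 
principle only, not the actual scanner code, which isn't shown in this thread) 
of a routine that works on *any* scanned object precisely because all objects 
present surfaces: it estimates a surface normal at each sample of a 
laser-scanned point cloud by fitting a local plane. Only numpy is assumed.

    import numpy as np

    def estimate_normals(points, k=8):
        """points: (N, 3) array of laser-scan samples from an unknown object."""
        points = np.asarray(points, dtype=float)
        normals = np.empty_like(points)
        for i, p in enumerate(points):
            # k nearest neighbours of p (brute force, for clarity)
            dists = np.linalg.norm(points - p, axis=1)
            nbrs = points[np.argsort(dists)[1:k + 1]]
            # fit a local plane by PCA: the surface normal is the direction
            # of least variance among the neighbouring samples
            centred = nbrs - nbrs.mean(axis=0)
            _, eigvecs = np.linalg.eigh(centred.T @ centred)
            normals[i] = eigvecs[:, 0]  # eigenvector of smallest eigenvalue
        return normals

Nothing in the routine names or needs the object's identity; the only 
assumption is the one stated above - that the input samples lie on surfaces.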

Your claim is baseless.

Dave


On Tue, Jul 13, 2010 at 7:34 PM, Mike Tintner tint...@blueyonder.co.uk wrote:

  Dave:You have no proof that programs can't handle anything you say that can't

  Sure I do. 

  **There is no such thing as a formula (or program as we currently understand 
it) that can or is meant to handle UNSPECIFIED, (ESP NEW, UNKNOWN)  KINDS OF 
ACTIONS AND OBJECTS**

  Every program is essentially a formula for a set form activity, which directs 
how to take a closed set of **specified kinds of actions and objects** - e.g.

  a + b + c + d + ... = 

  [take an a and a b and a c and a d ...]

  in order to produce set forms of products and procedures - (set combinations 
of those a, b, c, and d actions and objects).

  A recipe that specifies a set kind of cherry cake with set ingredients. 
[GAs, if you're wondering, are merely glorified recipes for mixing and 
endlessly remixing the same set of specific ingredients. Even random programs 
work with specified actions and objects.]

  There is no formula or program that says:

  take an a and a b and a c ... oh, and something else - a certain 'je ne 
sais quoi' - I don't know what it is, but you may be able to recognize it when 
you find it. Just keep looking ...

  There is no formula of the form

  A + B + C + D + ETC. = 

  [ETC. = et cetera/some other unspecified things]

  still less

  A + B + C + D + ETC ^ ETC = 

  [some other things x some other operations]

  That, I trust you will agree, is a contradiction of a formula and a program - 
more like an anti-formula/program. There are no et cetera formulas, and no 
logical or mathematical symbols for etc., are there?

  But to be creative and produce new kinds of products and procedures, small 
and trivial as well as large, you have to be able to work with and find just 
such **unspecified (and esp. new) kinds of actions and objects.** - et ceteras.

  If you want to develop a new kind of fruit cake or new kind of cherry 
cake, or even make a slightly different stew, or more or less the same cherry 
cake but without the maraschinos wh. have gone missing, then you have to be 
able to work with and find new kinds of ingredients and mix/prepare them in new 
kinds of ways - new exotic kinds of fruit and other foods in new mixes and 
mashes and fermentations - et cetera x et cetera.

  If you want to develop a new kind of word or alphabet (or develop a new kind 
of formula, as I just did above), then you have to be able to work with and 
find new kinds of letters and symbols and abbreviations (as I just did) - etc.

  If you even want to engage with any part of the real world at the most 
mundane level  - walk down a street say - you have to be able to be creative 
and deal with new unspecified kinds of actions and objects that you may find 
there - because you can't predict what that street will contain.

  And to be creative, you do indeed have to start not from a perfectly, fully 
specified formula, but something more like an et cetera anti-formula - a 
v. imperfectly and partially specified *conceptual paradigm*, such as:

  if you want to make a new different kind of cake/house/structure, you'll 
probably need an a and a b and a c ... but you'll also need some other 
things - some 'je ne sais quoi's - I don't know what they are -- but you 
may be able to recognize them when you

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-12 Thread Abram Demski
David,

I tend to think of probability theory and statistics as different things.
I'd agree that statistics is not enough for AGI, but in contrast I think
probability theory is a pretty good foundation. Bayesianism to me provides a
sound way of integrating the elegance/utility tradeoff of explanation-based
reasoning into the basic fabric of the uncertainty calculus. Others advocate
different sorts of uncertainty than probabilities, but so far what I've seen
indicates more a lack of ability to apply probability theory than a need for
a new type of uncertainty. What other methods do you favor for dealing with
these things?

--Abram





-- 
Abram Demski
http://lo-tho.blogspot.com/
http://groups.google.com/group/one-logic





Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-11 Thread David Jones
Thanks Abram,

I know that probability is one approach. But there are many problems with
using it in actual implementations. I know a lot of people will be angered
by that statement and retort with all the successes that they have had using
probability. But, the truth is that you can solve the problems many ways and
every way has its pros and cons. I personally believe that probability has
unacceptable cons if used all by itself. It must only be used when it is the
best tool for the task.

I do plan to use some probability within my approach. But only when it makes
sense to do so. I do not believe in completely statistical solutions or
completely Bayesian machine learning alone.

A good example of when I might use it: when a particular hypothesis
predicts something with 70% accuracy, it may be better than any other
hypothesis we can come up with so far. So, we may use that hypothesis. But
the 30% unexplained errors should still be explained, if at all possible with
the resources and algorithms available. This is where my method differs from
statistical methods. I want to build algorithms that resolve the 30% and
explain it. For many problems, there are rules and knowledge that will solve
them effectively. Probability should only be used when you cannot find a more
accurate solution.
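
As a toy sketch of that policy (the predicts() interface is hypothetical, 
purely for illustration): adopt the most predictive hypothesis available, but 
hand back its failures explicitly as material to be explained, rather than 
discarding them as noise.

    def choose_hypothesis(hypotheses, observations):
        """hypotheses: objects with a hypothetical predicts(obs) -> bool method."""
        def accuracy(h):
            return sum(h.predicts(o) for o in observations) / len(observations)
        best = max(hypotheses, key=accuracy)
        # e.g. a 70%-accurate hypothesis is adopted, but the remaining 30%
        # is returned for further explanation, not written off as noise
        unexplained = [o for o in observations if not best.predicts(o)]
        return best, accuracy(best), unexplained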

Basically, we should use probability when we don't know the factors involved,
can't find any rules to explain the phenomena, or don't have the time and
resources to figure it out. Then you must simply guess at the most probable
event, without any rules for deciding which event is more applicable under the
current circumstances.

So, in summary, probability definitely has its place. I just think that
explanatory reasoning and other more accurate methods should be preferred
whenever possible.

Regarding learning the knowledge being the bigger problem, I completely
agree. That is why I think it is so important to develop machine learning
that can learn by direct observation of the environment. Without that, it is
practically impossible to gather the knowledge required for AGI-type
applications. We can learn this knowledge by analyzing the world
automatically and generally through video.

My step-by-step approach for learning and then applying the knowledge for
AGI is as follows:
1) Understand and learn about the environment (through computer vision for
now, and other sensory perceptions in the future)
2) Learn about your own actions and how they affect the environment
3) Learn about language and how it is associated with or related to the
environment
4) Learn goals from language (such as through dedicated inputs)
5) Goal pursuit
6) Other miscellaneous capabilities as needed

Dave

On Sat, Jul 10, 2010 at 8:40 PM, Abram Demski abramdem...@gmail.com wrote:

 David,

 Sorry for the slow response.

 I agree completely about expectations vs predictions, though I wouldn't use
 that terminology to make the distinction (since the two terms are
 near-synonyms in English, and I'm not aware of any technical definitions
 that are common in the literature). This is why I think probability theory
 is necessary: to formalize this idea of expectations.

 I also agree that it's good to utilize previous knowledge. However, I think
 existing AI research has tackled this over and over; learning that knowledge
 is the bigger problem.

 --Abram






Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-10 Thread David Jones
Mike,

Using the image itself as a template to match is possible. In fact it has
been done before. But it suffers from several problems that would also need
solving.

1) Images are 2D. I assume you are also referring to 2D outlines. Real
objects are 3D. So, you're going to have to infer the shape of the object...
which means you are no longer actually transforming the image itself. You
are transforming a model of the image, which would have points, curves,
dimensions, etc. Basically, a mathematical shape :) No matter how much you
disapprove of encoding info, sometimes it makes sense to do it.
2) Creating the first outline and figuring out what to outline is not trivial 
at all. So, this method can only be used after you can do that. There is a lot 
more uncertainty involved here than you seem to realize. First, how do you even 
determine the outline? That is an unsolved problem. So, not only will you have 
to try many transformations with the right outline, you will have to try many 
with wrong outlines, increasing the possibilities (exponentially?). It looks 
like you need a way to score possibilities and decide which ones to try.
3) Rock is a word, and words are always learned by induction, along with other 
types of reasoning, before we can even consider it a type of object. So, you 
are starting with a somewhat unrepresentative or artificial problem.
4) Even the same rock can look very different from different perspectives. In 
fact, how do you even match the same rock? Please describe how your system 
would do this. It is not trivial at all. And you will soon see that there is an 
extremely large amount of uncertainty. Dealing with this type of uncertainty is 
the central problem of AGI. The central problem is not fluid schemas. Even if I 
used this method, I would be stuck with the same exact uncertainty problems. 
So, you're not going to get past them like this. The same research on 
explanatory and non-monotonic type reasoning must still be done.
5) What is a fluid transform? You can't just throw out words. Please define it. 
You are going to realize that your representation is pretty much geometric 
anyway. Regardless, it will likely be equivalent. Are you going to try every 
possible transformation? Nope. That would be impossible. So, how do you decide 
which transformations to try? When is a transformation too large a change to 
consider it the same rock? When is it too large to consider it a different 
rock?
6) Are you seriously going to transform every object you've ever tried to 
outline? This is going to be prohibitively costly in terms of processing, 
especially because you haven't defined how you're going to decide what to 
transform and what not to. So, before you can even use this algorithm, you're 
going to have to use something else to decide what is a possible candidate and 
what is not.


On Fri, Jul 9, 2010 at 6:42 PM, Mike Tintner tint...@blueyonder.co.uk wrote:

 Now let's see **you** answer a question. Tell me how any
 algorithmic/mathematical approach of any kind actual or in pure principle
 can be applied to recognize raindrops falling down a pane - and to
 predict their movement?


Like I've said many times before, we can't predict everything, and we
certainly shouldn't try. But


 http://www.pond5.com/stock-footage/263778/beautiful-rain-drops.html

 or to recognize a rock?

 http://www.handprint.com/HP/WCL/IMG/LPR/adams.jpg

 or a [filled] shopping bag?

 http://www.abc.net.au/reslib/200801/r215609_837743.jpg

 http://www.sustainableisgood.com/photos/uncategorized/2007/03/29/shoppingbags.jpg

 http://thegogreenblog.com/wp-content/uploads/2007/12/plastic_shopping_bag.jpg

 or if you want a real killer, google some vid clips of amoebas in oozing
 motion?

 PS In every case, I suggest, the brain observes different principles of
 transformation - for the most part unconsciously. And they can be of various
 kinds not just direct natural transformations, of course. It's possible, it
 occurs to me, that the principle that applies to rocks might just be
 something like whatever can be carved out of stone







Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-10 Thread David Jones
I accidentally pressed something and it sent it early... this is a finished
version:


Mike,

Using the image itself as a template to match is possible. In fact it has
been done before. But it suffers from several problems that would also need
solving.

1) Images are 2D. I assume you are also referring to 2D outlines. Real
objects are 3D. So, you're going to have to infer the shape of the object...
which means you are no longer actually transforming the image itself. You
are transforming a model of the image, which would have points, curves,
dimensions, etc. Basically, a mathematical shape :) No matter how much you
disapprove of encoding info, sometimes it makes sense to do it.
2) Creating the first outline and figuring out what to outline is not trivial 
at all. So, this method can only be used after you can do that. There is a lot 
more uncertainty involved here than you seem to realize. First, how do you even 
determine the outline? That is an unsolved problem. So, not only will you have 
to try many transformations with the right outline, you will have to try many 
with wrong outlines, increasing the possibilities (exponentially?). It looks 
like you need a way to score possibilities and decide which ones to try.
3) Rock is a word, and words are always learned by induction, along with other 
types of reasoning, before we can even consider it a type of object. So, you 
are starting with a somewhat unrepresentative or artificial problem.
4) Even the same rock can look very different from different perspectives. In 
fact, how do you even match the same rock? Please describe how your system 
would do this. It is not trivial at all. And you will soon see that there is an 
extremely large amount of uncertainty. Dealing with this type of uncertainty is 
the central problem of AGI. The central problem is not fluid schemas. Even if I 
used this method, I would be stuck with the same exact uncertainty problems. 
So, you're not going to get past them like this. The same research on 
explanatory and non-monotonic type reasoning must still be done.
5) What is a fluid transform? You can't just throw out words. Please define it. 
You are going to realize that your representation is pretty much geometric 
anyway. Regardless, it will likely be equivalent. Are you going to try every 
possible transformation? Nope. That would be impossible. So, how do you decide 
which transformations to try? When is a transformation too large a change to 
consider it the same rock? When is it too large to consider it a different 
rock?
6) Are you seriously going to transform every object you've ever tried to 
outline? This is going to be prohibitively costly in terms of processing, 
especially because you haven't defined how you're going to decide what to 
transform and what not to. So, before you can even use this algorithm, you're 
going to have to use something else to decide what is a possible candidate and 
what is not.


On Fri, Jul 9, 2010 at 6:42 PM, Mike Tintner tint...@blueyonder.co.uk wrote:

  Now let's see **you** answer a question. Tell me how any
 algorithmic/mathematical approach of any kind actual or in pure principle
 can be applied to recognize raindrops falling down a pane - and to
 predict their movement?


Like I've said many times before, we can't predict everything, and we
certainly shouldn't try. But we should expect what might happen. Raindrops are
probably recognized as an unexpected distortion when they occur on a window.
When it's not on a window, a raindrop is still a sort of distortion of
brightness, and just a small object with different contrast. You're right that
geometric definitions are not the right way to recognize that. It would have to
use a different method to remember the features/properties of raindrops and how
they appeared, such as the contrast, size, quantity, location, context, etc.
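
A hedged sketch of what remembering such features might look like (the 
feature choices and thresholds here are assumptions for illustration, not a 
worked-out recognizer): characterize a candidate region by appearance 
statistics instead of a geometric outline, then compare against a remembered 
profile for a category like raindrop on glass.

    import numpy as np

    def region_features(patch):
        """patch: 2D array of grey levels for one candidate image region."""
        return np.array([
            patch.std(),        # local contrast
            float(patch.size),  # region size in pixels
            patch.mean(),       # overall brightness
        ])

    def looks_like(patch, profile, tolerance):
        """profile, tolerance: remembered per-feature mean and allowed deviation."""
        return bool(np.all(np.abs(region_features(patch) - profile) <= tolerance))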


 http://www.pond5.com/stock-footage/263778/beautiful-rain-drops.html

 or to recognize a rock?


A specific rock could be recognized with geometric definitions. Texture is
certainly important, as are size, context (very important), etc. If we are
talking about the category rock, that's different from an instance of a rock.
The category of rock probably needs a description of the types of properties
that rocks have, such as the types of curves, texture, sizes, interactions,
behavior, etc. Exactly how to do it, I haven't decided. I'm not at that point
yet.



 http://www.handprint.com/HP/WCL/IMG/LPR/adams.jpg

 or a [filled] shopping bag?


same as the rock.



 http://www.abc.net.au/reslib/200801/r215609_837743.jpg

 http://www.sustainableisgood.com/photos/uncategorized/2007/03/29/shoppingbags.jpg

 http://thegogreenblog.com/wp-content/uploads/2007/12/plastic_shopping_bag.jpg

 or if you want a real killer, google some vid clips of amoebas in oozing
 motion?


same.



 PS In every case, I suggest, the brain observes different principles of
 transformation - for the most part unconsciously. And they can be of various
 kinds not just direct natural 

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-10 Thread Mike Tintner

You are ironically misunderstanding the very foundations and rationale of 
geometry. Geometry - with its set forms - was invented precisely because 
mathematicians didn't like the freeform nature of the world - they wanted to 
create set forms (in the footsteps of the rational technologists who preceded 
them) that they could control, reduce to formulae, and reproduce with ease. 
Freeform rocks are a lot more complex to draw and make and reproduce than set 
form rectangular bricks.

Set forms are not free forms. They are the opposite.

(And while you and others will continue to *claim*, in theory, absolute 
setform=freeform nonsense, you will in practice always, always stick to setform 
objects. Some part of you knows the v. obvious truth.)



 

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-10 Thread David Jones
 or reason.

 Here is a graphic demonstration of what you're trying to claim - in effect,
 you're saying:

 geometry can define 'a piece of plasticine' [and by extension any
 standard transformation of a piece of plasticine, as in a playroom]

 That's a nonsense. A piece of plasticine is a **freeform** object - it can
 be transformed into an unlimited diversity of shapes/forms (albeit with
 constraints).

 Formulae - the formulae of geometry - can only define **set form** objects,
 with a precise form and structure. There are no exceptions. Black is not
 white. Homogeneous is not heterogeneous. Set form is not freeform.

 All the objects I list - all irregular objects - are freeform objects.

 You are ironically misunderstanding the very foundations and rationale of
 geometry. Geometry - with its set forms - was invented precisely
 because mathematicians didn't like the freeform nature of the world - they
 wanted to create set forms (in the footsteps of the rational technologists who
 preceded them) that they could control, reduce to formulae, and
 reproduce with ease. Freeform rocks are a lot more complex to draw and make
 and reproduce than set form rectangular bricks.

 Set forms are not free forms. They are the opposite.

 (And while you and others will continue to *claim*, in theory, absolute
 setform=freeform nonsense, you will in practice always, always stick to
 setform objects. Some part of you knows the v. obvious truth.)





Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-10 Thread Mike Tintner
Dave: You can't solve the problems with your approach either

This is based on knowledge of what examples? Zero?

I have given you one instance of s.o. [a technologist, not a philosopher like 
me] who is, if only in broad principle, trying to proceed in a non-encoding, 
analog-comparison direction. There must be others who are, however crudely, 
trying and considering what can be broadly classified as analog approaches. How 
much do you know, or have you even thought, about such approaches? [Of course, 
computing doesn't have to be either/or analog-digital but can be both.]

My point 6) BTW is irrefutable, completely irrefutable, and puts a finger bang 
on why geometry obviously cannot cope with real objects (although I can, and 
must, do a much more extensive job of exposition).




From: David Jones 
Sent: Saturday, July 10, 2010 5:44 PM
To: agi 
Subject: Re: [agi] Re: Huge Progress on the Core of AGI


Mike, 

Your claim that you have to reject encoded and simpler descriptions of the 
world to solve AGI is unfounded. You can't solve the problems with your 
approach either. So, this argument is going nowhere. You won't admit that 
you're faced with the same problems no matter how you approach it. I do admit 
that your ideas on transformations can be useful, but not at all by themselves, 
and definitely not in the absence of math or geometry. They are also certainly 
not a solution to any of the problems I'm considering. Regardless, we both face 
the same problems of uncertainty and encoding.

Dave


On Sat, Jul 10, 2010 at 12:09 PM, Mike Tintner tint...@blueyonder.co.uk wrote:

  General point: you keep talking as if algorithms *work* for visual AGI - they 
don't - they simply haven't. Unless you take a set of objects carefully chosen 
to be closely aligned and close in overall form - and then it's not AGI. But in 
general the algorithmic, patterned approach has been a bust - because natural 
objects, as well as clusters of diverse artificial objects, are not patterned. 
You can see this. It's actually obvious if you care to look.

  Re 2) It may well be that you've gotta have a body to move around to 
different POVs for objects, and to touch those objects and use another sense 
or two to determine the outlines. I haven't thought all this through at all, 
but you've got to realise that the whole of evolution tells you that sensing 
the world is a *multi*-*common*-sense affair, not a single one. You're 
trying to isolate a sense - and insisting that that's the only way things can 
be done, even while you, along with others, are continuously failing. Respect 
and learn from evolution.

  Re 1) I again haven't thought this through, but it sounds like you're again 
assuming that your AGI vision must automatically meet adult, educated criteria. 
Presumably it takes time to perceive and appreciate the 3-D-ness of objects. 
And 3-D is a mathematical, highly evolved idea. Yes, objects are solid, but 
they were never 3-D until geometry was invented a mere 2,000 or so years ago. 
Primitive people see very differently from modern people. Read McLuhan on this 
(v. worthwhile generally for s.o. like you).

  And no, rocks are simply *not* mathematical objects. There are no rocks in 
geometry, period. *You* can use a mathematically-based program to draw a rock, 
but that's down to your AGI mind, not the mathematics.

  [Look BTW how you approach all these things - you always start mathematically 
- but it is a simple fact that maths was invented only a few thousand years 
ago; animals and humans happily existed and saw the world without it, and maths 
objects are **abstract fictions** - they do not exist in the real world, as 
maths itself will tell you - and you have to be able to *see* that - to see and 
know that there is a diff. between a postulated math square and any concrete, 
real-object version of a square. What visual processing are you going to use to 
tell the difference between a math square and a real object? Are you saying you 
can use maths to do that?

  Nonsense.]

  3) I am starting with simple natural irregular objects. I can recognize that 
rocks may have too large a range of irregularity for first visual perception. 
(It'd be v.g. to know how soon infants recognize them.) Maybe then you need 
something with a narrower range, like shopping bags. I'd again study the 
development of infant perception - that will give you the best ideas re what to 
start with.

  But what's vital is that your objects be natural and irregular, not narrow-AI 
formulaic squares.

  5) A fluid transform is, er, a fluid transform. What are all the ways a 
raindrop, as per the vid, can transform into a different form - all the ways 
that the outline of the drop can continuously reshape? Jeez, they're pretty 
well infinite, except that they're constrained. The drop isn't suddenly going 
to become a square or rectilinear. And you can presumably invent new 
lines/fields of transformation wh. could turn out to be true.

  But if you think

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-10 Thread David Jones
On Sat, Jul 10, 2010 at 5:02 PM, Mike Tintner tint...@blueyonder.co.uk wrote:

  Dave:You can't solve the problems with your approach either

 This is based on knowledge of what examples? Zero?


It is based on the fact that you have refused to show how you deal with
uncertainty. You haven't even conceded that there is uncertainty. I know for a
fact that your method cannot resolve the uncertainty, because it doesn't even
consider that there might be any. It is not a solution to anything. It is a
mere suggestion of a way to compare objects. It isn't even a way to match them!
So, when you're done comparing, your method only says it is different by this
much. Well, what the hell does that do for you? Nothing at all. So, clearly my
statement that your approach doesn't solve anything is well based. Yet your
claim that my approach is wrong is very poorly based. Your main disagreement is
with my simplification of the problem. That doesn't mean anything. I can go
back and forth between the simple version and the more complex version whenever
I want to, after I've gained understanding through experiments on the simpler
version. There is nothing wrong with the approach I am taking. It is completely
necessary to study the nature of the problems and the principles that can solve
them.

 I have given you one instance of s.o. [a technologist not a philosopher
 like me] who is if only in broad principle, trying to proceed in
 a non-encoding, analog-comparison direction. There must be others who are
 however crudely trying and considering what can be broadly classified as
 analog approaches. How much do you know, or have you even thought about such
 approaches? [Of course, computing doesn't have to be either/or
 analog-digital but can be both]


The approaches are equivalent. I don't even say that my approach is digital.
If I find a reason to use an analog approach, I'll use it. But so far, I can't
find any reason to do so. BTW, you would be wiser to realize that analog can
likely be represented well by digital encoding for the problems we are
discussing. I see absolutely no reason to think analog is better than digital
for any of these problems. You simply have a bias against my approach, and bias
is not sufficient reason to disagree with me.


 My point 6) BTW is irrefutable, completely irrefutable, and puts a finger
 bang on why geometry  obviously cannot cope with real objects,  ( although I
 can and must, do a much more extensive job of exposition).


That is ridiculous. First of all, a plastic bag can easily be represented
geometrically as a mesh with length constraints and connectivity constraints.
Of course it doesn't represent every possible transformation of the bag; it
doesn't even make sense to store such a representation. In fact, it's not
possible. Your claim that geometry can't represent a plastic bag is downright
dumb and trivially refutable. You could easily use your own ideas then to
transform the mesh for matching, although I still claim this is not the right
way to always match objects. In fact, I would dare say it is often the wrong
way to match objects, because of the processing and time cost.
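
For what it's worth, here is a minimal sketch of the representation just 
described - a sketch under stated assumptions (numpy; a 5% stretch tolerance 
chosen arbitrarily), not a full matcher. Any pose whose edges keep their rest 
lengths within tolerance counts as a valid configuration of the bag; nothing 
enumerates or stores every transformation.

    import numpy as np

    class BagMesh:
        """A deformable surface: vertices plus edges carrying rest lengths."""
        def __init__(self, vertices, edges, stretch=0.05):
            self.vertices = np.asarray(vertices, dtype=float)  # (N, 3) positions
            self.edges = [tuple(e) for e in edges]             # vertex-index pairs
            self.rest = {(i, j): np.linalg.norm(self.vertices[i] - self.vertices[j])
                         for i, j in self.edges}
            self.stretch = stretch  # allowed fractional change per edge

        def is_valid_pose(self, pose):
            """True if candidate (N, 3) positions respect the length constraints."""
            pose = np.asarray(pose, dtype=float)
            return all(abs(np.linalg.norm(pose[i] - pose[j]) - r) <= self.stretch * r
                       for (i, j), r in self.rest.items())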





Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-10 Thread Abram Demski
David,

Sorry for the slow response.

I agree completely about expectations vs predictions, though I wouldn't use
that terminology to make the distinction (since the two terms are
near-synonyms in English, and I'm not aware of any technical definitions
that are common in the literature). This is why I think probability theory
is necessary: to formalize this idea of expectations.

I also agree that it's good to utilize previous knowledge. However, I think
existing AI research has tackled this over and over; learning that knowledge
is the bigger problem.

--Abram

On Thu, Jul 8, 2010 at 6:32 PM, David Jones davidher...@gmail.com wrote:

 Abram,

 Yeah, I would have to object for a couple reasons.

 First, prediction requires previous knowledge. So, even if you make that
 your primary goal, you're still going to have my research goals as the
 prerequisite: which are to process visual information in a more general way
 and learn about the environment in a more general way.

 Second, not everything is predictable. Certainly, we should not try to
 predict everything. Only after we have experience, can we actually predict
 anything. Even then, it's not precise prediction, like predicting the next
 frame of a video. It's more like having knowledge of what is quite likely to
 occur, or maybe an approximate prediction, but not guaranteed in the least.
 For example, based on previous experience, striking a match will light it.
 But, sometimes it doesn't light, and that too is expected to occur
 sometimes. We definitely don't predict the next image we'll see when it
 lights though. We just have expectations for what we might see and this
 helps us interpret the image effectively. We should try to expect certain
 outcomes or possible outcomes though. You could call that prediction, but
 it's not quite the same. The things we are more likely to see should be
 attempted as an explanation first and preferred if not given a reason to
 think otherwise.
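
A toy version of the distinction (the outcome names and numbers are assumed, 
purely illustrative): keep a ranked set of expected outcomes and try the most 
expected explanation first, rather than predicting one exact next frame.

    # assumed, illustrative expectations for the match-striking example
    EXPECTED_OUTCOMES = {"match lights": 0.9, "match fails to light": 0.1}

    def interpret(consistent_outcomes):
        """consistent_outcomes: set of outcome names the observation allows."""
        # prefer the most expected outcome; fall back down the ranking
        for outcome in sorted(EXPECTED_OUTCOMES, key=EXPECTED_OUTCOMES.get,
                              reverse=True):
            if outcome in consistent_outcomes:
                return outcome
        return "unexplained"  # a surprise: nothing expected fits the evidence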


 Dave


 On Thu, Jul 8, 2010 at 5:51 PM, Abram Demski abramdem...@gmail.com wrote:

 David,

 How I'd present the problem would be predict the next frame, or more
 generally predict a specified portion of video given a different portion. Do
 you object to this approach?

 --Abram

 On Thu, Jul 8, 2010 at 5:30 PM, David Jones davidher...@gmail.com wrote:

 It may not be possible to create a learning algorithm that can learn how
 to process images generally, or solve other general AGI problems. This is for
 the same reason that completely general vision algorithms are likely
 impossible. I think that figuring out how to process sensory information
 intelligently requires either 1) impossible amounts of processing or 2)
 intelligent design and understanding by us.

 Maybe you could be more specific about how general learning algorithms
 would solve problems such as the one I'm tackling. But I am extremely
 doubtful it can be done, because the problems cannot be effectively described
 to such an algorithm. If you can't describe the problem, it can't search for
 solutions. If it can't search for solutions, you're basically stuck with
 evolution-type algorithms, which require prohibitive amounts of processing.

 The reason that vision is so important for learning is that sensory
 perception is the foundation required to learn everything else. If you don't
 start with a foundational problem like this, you won't be representing the
 real nature of general intelligence problems that require extensive
 knowledge of the world to solve properly. Sensory perception is required to
 learn the information needed to understand everything else. Text and
 language for example, require extensive knowledge about the world to
 understand and especially to learn about. If you start with general learning
 algorithms on these unrepresentative problems, you will get stuck as we
 already have.

 So, it still makes a lot of sense to start with a concrete problem that
 does not require extensive amounts of previous knowledge to start learning.
 In fact, AGI requires that you not pre-program the AI with such extensive
 knowledge. So, lots of people are working on general learning algorithms
 that are unrepresentative of what is required for AGI because the algorithms
 don't have the knowledge needed to learn what they are trying to learn
 about. Regardless of how you look at it, my approach is definitely the right
 approach to AGI in my opinion.



 On Thu, Jul 8, 2010 at 5:02 PM, Abram Demski abramdem...@gmail.com wrote:

 David,

 That's why, imho, the rules need to be *learned* (and, when need be,
 unlearned). IE, what we need to work on is general learning algorithms, not
 general visual processing algorithms.

 As you say, there's not even such a thing as a general visual processing
 algorithm. Learning algorithms suffer similar environment-dependence, but
 (by their nature) not as severe...

 --Abram

 On Thu, Jul 8, 2010 at 3:17 PM, David Jones davidher...@gmail.comwrote:

 I've learned something really 

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-09 Thread David Jones
Mike,

On Thu, Jul 8, 2010 at 6:52 PM, Mike Tintner tint...@blueyonder.co.uk wrote:

  Isn't the first problem simply to differentiate the objects in a scene?


Well, that is part of the movement problem. If you say something moved, you
are also saying that the objects in the two or more video frames are the
same instance.


 (Maybe the most important movement to begin with is not  the movement of
 the object, but of the viewer changing their POV if only slightly  - wh.
 won't be a factor if you're looking at a screen)


Maybe, but this problem becomes kind of trivial in a 2D environment,
assuming you don't allow rotation of the POV. Moving the POV would simply
translate all the objects linearly. If you make it a 3D environment, it
becomes significantly more complicated. I could work on 3D, which I will,
but I'm not sure I should start there. I probably should consider it though
and see what complications it adds to the problem and how they might be
solved.
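
The 2D case really is just a uniform shift, as this two-line illustration 
shows (an assumption-free toy, not part of David's program):

    def shift_pov(object_positions, dx, dy):
        """Viewer moves by (dx, dy); every object shifts by (-dx, -dy) on screen."""
        return [(x - dx, y - dy) for (x, y) in object_positions]

    # shift_pov([(0, 0), (5, 2)], 1, 1) == [(-1, -1), (4, 1)]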


 And that I presume comes down to being able to put a crude, highly
 tentative, and fluid outline round them (something that won't be neces. if
 you're dealing with squares?) . Without knowing v. little if anything about
 what kind of objects they are. As an infant most likely does. {See infants'
 drawings and how they evolve v. gradually from a v. crude outline blob that
 at first can represent anything - that I'm suggesting is a replay of how
 visual perception developed).


 The fluid outline or image schema is arguably the basis of all intelligence
 - just about everything AGI is based on it.  You need an outline for
 instance not just of objects, but of where you're going, and what you're
 going to try and do - if you want to survive in the real world.  Schemas
 connect everything AGI.

 And it's not a matter of choice - first you have to have an outline/sense
 of the whole - whatever it is -  before you can start filling in the parts.



Well, this is the question. The solution is underdetermined, which means
that a right solution is not possible to know with complete certainty. So,
you may take the approach of using contours to match objects, but that is
certainly not the only way to approach the problem. Yes, you have to use
local features in the image to group pixels together in some way. I agree
with you there.

Is using contours the right way? Maybe, but not by itself. You have to
define the problem a little better than just saying that we need to
construct an outline. The real problem/question is this: How do you
determine the uncertainty of a hypothesis, lower it and also determine how
good a hypothesis is, especially in comparison to other hypotheses?

So, in this case, we are trying to use an outline comparison to determine
the best match hypotheses between objects. But, that doesn't define how you
score alternative hypotheses. That also is certainly not the only way to do
it. You could use the details within the outline too. In fact, in some
situations, this would be required to disambiguate between the possible
hypotheses.
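
One hedged sketch of such scoring (the descriptors and equal weights are 
placeholder assumptions, not a worked-out method): combine an outline term 
with an interior term, and read the margin between the top two candidates as a 
crude measure of how uncertain the winning hypothesis is.

    def outline_score(a, b):
        """Crude outline similarity from one descriptor (perimeter length)."""
        return 1.0 - abs(a["perimeter"] - b["perimeter"]) / max(a["perimeter"],
                                                                b["perimeter"])

    def interior_score(a, b):
        """Crude interior similarity from mean grey level (0-255)."""
        return 1.0 - abs(a["mean_grey"] - b["mean_grey"]) / 255.0

    def score_match(a, b, w_outline=0.5, w_interior=0.5):
        return w_outline * outline_score(a, b) + w_interior * interior_score(a, b)

    def best_match(obj, candidates):
        ranked = sorted(candidates, key=lambda c: score_match(obj, c), reverse=True)
        margin = (score_match(obj, ranked[0]) - score_match(obj, ranked[1])
                  if len(ranked) > 1 else 1.0)
        return ranked[0], margin  # low margin = high uncertainty between hypotheses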


 P.S. It would be mindblowingly foolish BTW to think you can do better
 than the way an infant learns to see - that's an awfully big visual section
 of the brain there, and it works.


I'm not trying to do better than the human brain. I am trying to solve the
same problems that the brain solves, in a different way - sometimes better than
the brain, sometimes worse, sometimes equivalently. What would be foolish is to
assume the only way to duplicate general intelligence is to copy the human
brain. By taking that approach, you are forced to reverse engineer and
understand something that is extremely difficult to reverse engineer. In
addition, a solution using the brain's design may not be economically feasible.
So, approaching the problem by copying the human brain has additional risks.
You may end up figuring out how the brain works and not be able to use it. You
also might not end up with a good understanding of what other solutions might
be possible.
Dave





Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-09 Thread Mike Tintner
Couple of quick comments (I'm still thinking about all this - but I'm 
confident everything AGI links up here).

A fluid schema is arguably, by its v. nature, a method - a trial-and-error, 
arguably universal method. It links vision to the hand, or any effector. 
Handling objects also is based on fluid schemas - you put out a fluid, 
adjustably-shaped hand to grasp things. And even if you don't have hands, like 
a worm, and must grasp things with your body, and must grasp the ground under 
which you move, then too you must use fluid body schemas/maps.

All concepts - the basis of language and before language, all intelligence - 
are also almost certainly fluid schemas (and not as you suggested, patterns).

All creative problemsolving begins from concepts of what you want to do  (and 
not formulae or algorithms as in rational problemsolving). Any suggestion to 
the contrary will not, I suggest, bear the slightest serious examination.

**Fluid schemas/concepts/fluid outlines are attempts-to-grasp-things - 
gropings.** 

Point 2: I'd relook at your assumptions in all your musings - my impression 
is they all assume, unwittingly, an *adult* POV - the view of s.o. who already 
knows how to see - as distinct from an infant who is just learning to see and 
get to grips with an extremely blurred world (even more blurred and 
confusing, I wouldn't be surprised, than that Prakash video). You're 
unwittingly employing top-down, fully-formed-intelligence assumptions even 
while overtly trying to produce a learning system - you're looking for what an 
adult wants to know, rather than what an infant 
starting-from-almost-no-knowledge-of-the-world wants to know.

If you accept the point in any way, major philosophical rethinking is required.



From: David Jones 
Sent: Friday, July 09, 2010 1:56 PM
To: agi 
Subject: Re: [agi] Re: Huge Progress on the Core of AGI


Mike,


On Thu, Jul 8, 2010 at 6:52 PM, Mike Tintner tint...@blueyonder.co.uk wrote:

  Isn't the first problem simply to differentiate the objects in a scene? 

Well, that is part of the movement problem. If you say something moved, you are 
also saying that the objects in the two or more video frames are the same 
instance.
 
  (Maybe the most important movement to begin with is not  the movement of the 
object, but of the viewer changing their POV if only slightly  - wh. won't be a 
factor if you're looking at a screen)

Maybe, but this problem becomes kind of trivial in a 2D environment, assuming 
you don't allow rotation of the POV. Moving the POV would simply translate all 
the objects linearly. If you make it a 3D environment, it becomes significantly 
more complicated. I could work on 3D, which I will, but I'm not sure I should 
start there. I probably should consider it though and see what complications it 
adds to the problem and how they might be solved.
 
  And that I presume comes down to being able to put a crude, highly tentative, 
and fluid outline round them (something that won't be neces. if you're dealing 
with squares?) . Without knowing v. little if anything about what kind of 
objects they are. As an infant most likely does. {See infants' drawings and how 
they evolve v. gradually from a v. crude outline blob that at first can 
represent anything - that I'm suggesting is a replay of how visual perception 
developed).

  The fluid outline or image schema is arguably the basis of all intelligence - 
just about everything AGI is based on it.  You need an outline for instance not 
just of objects, but of where you're going, and what you're going to try and do 
- if you want to survive in the real world.  Schemas connect everything AGI.

  And it's not a matter of choice - first you have to have an outline/sense of 
the whole - whatever it is -  before you can start filling in the parts.


Well, this is the question. The solution is underdetermined, which means that a 
right solution is not possible to know with complete certainty. So, you may 
take the approach of using contours to match objects, but that is certainly not 
the only way to approach the problem. Yes, you have to use local features in 
the image to group pixels together in some way. I agree with you there.  

Is using contours the right way? Maybe, but not by itself. You have to define 
the problem a little better than just saying that we need to construct an 
outline. The real problem/question is this: How do you determine the 
uncertainty of a hypothesis, lower it and also determine how good a hypothesis 
is, especially in comparison to other hypotheses? 

So, in this case, we are trying to use an outline comparison to determine the 
best match hypotheses between objects. But, that doesn't define how you score 
alternative hypotheses. That also is certainly not the only way to do it. You 
could use the details within the outline too. In fact, in some situations, this 
would be required to disambiguate between the possible hypotheses.  



  P.S. It would be mindblowingly foolish BTW to think you can do better than 
the way an infant learns to see - that's an awfully big visual section of the 
brain there, and it works.

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-09 Thread David Jones
 *From:* David Jones davidher...@gmail.com
 *Sent:* Friday, July 09, 2010 1:56 PM
 *To:* agi agi@v2.listbox.com
 *Subject:* Re: [agi] Re: Huge Progress on the Core of AGI

 Mike,

 On Thu, Jul 8, 2010 at 6:52 PM, Mike Tintner tint...@blueyonder.co.ukwrote:

  Isn't the first problem simply to differentiate the objects in a scene?


 Well, that is part of the movement problem. If you say something moved, you
 are also saying that the objects in the two or more video frames are the
 same instance.


  (Maybe the most important movement to begin with is not  the movement of
 the object, but of the viewer changing their POV if only slightly  - wh.
 won't be a factor if you're looking at a screen)


 Maybe, but this problem becomes kind of trivial in a 2D environment,
 assuming you don't allow rotation of the POV. Moving the POV would simply
 translate all the objects linearly. If you make it a 3D environment, it
 becomes significantly more complicated. I could work on 3D, which I will,
 but I'm not sure I should start there. I probably should consider it though
 and see what complications it adds to the problem and how they might be
 solved.
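
A small sketch of that 2D point, assuming each frame has already been reduced
to a list of object centroids paired up by index (every name here is
hypothetical):

import numpy as np

def estimate_pov_shift(centroids_a, centroids_b):
    # With no rotation, a POV change translates every object by the same
    # vector, so the median per-object displacement is a robust estimate.
    a = np.asarray(centroids_a, dtype=float)
    b = np.asarray(centroids_b, dtype=float)
    return np.median(b - a, axis=0)

# Every object shifted by (3, -1), so that is the recovered POV shift.
print(estimate_pov_shift([(0, 0), (5, 2)], [(3, -1), (8, 1)]))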


  And that I presume comes down to being able to put a crude, highly
 tentative, and fluid outline round them (something that won't be neces. if
 you're dealing with squares?). While knowing v. little if anything about
 what kind of objects they are. As an infant most likely does. (See infants'
 drawings and how they evolve v. gradually from a v. crude outline blob that
 at first can represent anything - that I'm suggesting is a replay of how
 visual perception developed.)


 The fluid outline or image schema is arguably the basis of all
 intelligence - just about everything AGI is based on it.  You need an
 outline for instance not just of objects, but of where you're going, and
 what you're going to try and do - if you want to survive in the real world.
 Schemas connect everything AGI.

 And it's not a matter of choice - first you have to have an outline/sense
 of the whole - whatever it is -  before you can start filling in the parts.



 Well, this is the question. The solution is underdetermined, which means
 that no solution can be known to be right with complete certainty. So,
 you may take the approach of using contours to match objects, but that is
 certainly not the only way to approach the problem. Yes, you have to use
 local features in the image to group pixels together in some way. I agree
 with you there.

 Is using contours the right way? Maybe, but not by itself. You have to
 define the problem a little better than just saying that we need to
 construct an outline. The real problem/question is this: How do you
 determine the uncertainty of a hypothesis, lower it and also determine how
 good a hypothesis is, especially in comparison to other hypotheses?

 So, in this case, we are trying to use an outline comparison to determine
 the best match hypotheses between objects. But, that doesn't define how you
 score alternative hypotheses. That also is certainly not the only way to do
 it. You could use the details within the outline too. In fact, in some
 situations, this would be required to disambiguate between the possible
 hypotheses.
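
To make "scoring" a bit more concrete, here is a minimal sketch, assuming
objects arrive as binary masks over a grayscale frame; the two descriptors and
the equal weighting are hypothetical placeholders, not a worked-out method:

import numpy as np

def outline_score(mask_a, mask_b):
    # Crude outline comparison: fill ratio and aspect of the bounding box.
    def descr(mask):
        ys, xs = np.nonzero(mask)
        h, w = ys.ptp() + 1, xs.ptp() + 1
        return np.array([mask.sum() / float(h * w), h / float(w)])
    return 1.0 / (1.0 + np.abs(descr(mask_a) - descr(mask_b)).sum())

def interior_score(img_a, mask_a, img_b, mask_b):
    # "Details within the outline": compare mean intensity inside the objects.
    return 1.0 - abs(img_a[mask_a].mean() - img_b[mask_b].mean()) / 255.0

def match_score(img_a, mask_a, img_b, mask_b):
    # Outline evidence alone can leave two hypotheses tied; interior
    # evidence is one way to break the tie.
    return 0.5 * outline_score(mask_a, mask_b) + \
           0.5 * interior_score(img_a, mask_a, img_b, mask_b)

The open question above stays open: nothing here says how such a score should
translate into the uncertainty of a match hypothesis.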


 P.S. It would be mindblowingly foolish BTW to think you can do better
 than the way an infant learns to see - that's an awfully big visual section
 of the brain there, and it works.


 I'm not trying to do better than the human brain. I am trying to solve
 the same problems that the brain solves in a different way, sometimes better
 than the brain, sometimes worse, sometimes equivalently. What would be
 foolish is to assume the only way to duplicate general intelligence is to
 copy the human brain. By taking this approach, you are forced to reverse
 engineer and understand something that is extremely difficult to reverse
 engineer. In addition, a solution that uses the brain's design may not be
 economically feasible. So, approaching the problem by copying the human
 brain has additional risks. You may end up figuring out how the brain works
 and not be able to use it. In addition, you might not end up with a good
 understanding of what other solutions might be possible.

 Dave






Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-09 Thread Mike Tintner
If fluid schemas - speaking broadly - are what is needed, (and I'm pretty sure 
they are), it's n.g. trying for something else. You can't substitute a square 
approach for a fluid amoeba outline approach. (And you will certainly need 
exactly such an approach to recognize amoebas).

If it requires a new kind of machine, or a radically new kind of instruction 
set for computers, then that's what it requires - Stan Franklin, BTW, is one 
person who does recognize this problem, and is trying to deal with it - might be 
worth checking up on him.

This is partly BTW why my instinct is that it may be better to start with tasks 
for robot hands*, because it should be possible to get them to apply a 
relatively flexible and fluid grip/handshape and grope for and experiment with 
differently shaped objects. And if you accept the broad philosophy I've been 
outlining, then it does make sense that evolution should have started with 
touch as a more primary sense, well before it got to vision. 

*Or perhaps it may prove better to start with robot snakes/bodies or somesuch.


From: David Jones 
Sent: Friday, July 09, 2010 3:22 PM
To: agi 
Subject: Re: [agi] Re: Huge Progress on the Core of AGI





On Fri, Jul 9, 2010 at 10:04 AM, Mike Tintner tint...@blueyonder.co.uk wrote:

  Couple of quick comments (I'm still thinking about all this  - but I'm 
confident everything AGI links up here).

  A fluid schema is arguably by its v. nature a method - a trial and error, 
arguably universal method. It links vision to the hand or any effector. 
Handling objects also is based on fluid schemas - you put out a fluid 
adjustably-shaped hand to grasp things. And even if you don't have hands, like 
a worm, and must grasp things with your body, and must grasp the ground under 
which you move, then too you must use fluid body schemas/maps.

  All concepts - the basis of language and before language, all intelligence - 
are also almost certainly fluid schemas (and not as you suggested, patterns).

"Fluid schemas" is not an actual algorithm. It is not clear how to go about 
implementing such a design. Even so, when you get into the details of actually 
implementing it, you will find yourself faced with the exact same problems I'm 
trying to solve. So, let's say you take the first frame and generate an initial 
fluid schema. What if an object disappears? What if the object changes? What 
if the object moves a little or a lot? What if a large number of changes occur 
at once, like one new thing suddenly blocking a bunch of similar stuff that is 
behind it? How far does your fluid schema have to be distorted for the 
algorithm to realize that it needs a new schema and can't use the same old one? 
You can't just say that all objects are always present and just distort the 
schema. What if two similar objects appear or both move and one disappears? How 
does your schema handle this? Regardless of whether you talk about hypotheses 
or schemas, it is the SAME problem. You can't avoid the fact that the whole 
thing is underdetermined and you need a way to score and compare hypotheses. 

If you disagree, please define your schema algorithm a bit more specifically. 
Then we would be able to analyze its pros and cons better.
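
To see the equivalence concretely, here is a minimal sketch of the hypothesis
space those what-if questions force on any approach, schemas included; objects
are (x, y) points, and the fixed appearance/disappearance penalty is an
arbitrary placeholder:

from itertools import combinations, permutations

def correspondence_hypotheses(objs_a, objs_b):
    # Every way of pairing frame-1 objects with frame-2 objects, allowing
    # objects to disappear (unmatched in A) or appear (unmatched in B).
    # Exhaustive, so only usable for tiny scenes.
    n, m = len(objs_a), len(objs_b)
    for k in range(min(n, m) + 1):
        for a_idx in combinations(range(n), k):
            for b_idx in permutations(range(m), k):
                yield list(zip(a_idx, b_idx))

def cost(hyp, objs_a, objs_b, penalty=5.0):
    # Matched objects pay their movement distance; each unmatched object
    # pays the fixed penalty.
    move = sum(abs(objs_a[i][0] - objs_b[j][0]) +
               abs(objs_a[i][1] - objs_b[j][1]) for i, j in hyp)
    unmatched = (len(objs_a) - len(hyp)) + (len(objs_b) - len(hyp))
    return move + penalty * unmatched

objs_a, objs_b = [(0, 0)], [(1, 0), (5, 5)]
best = min(correspondence_hypotheses(objs_a, objs_b),
           key=lambda h: cost(h, objs_a, objs_b))
print(best)  # [(0, 0)]: the object moved slightly, and (5, 5) is new

Whether the hypotheses are phrased as matchings or as distortions of a schema,
something has to play the role of this cost function.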
 

  All creative problemsolving begins from concepts of what you want to do  (and 
not formulae or algorithms as in rational problemsolving). Any suggestion to 
the contrary will not, I suggest, bear the slightest serious examination.

Sure.  I would point out though that children do stuff just to learn in the 
beginning. A good example is our desire to play. Playing is a strategy by which 
children learn new things even though they don't have a need for those things 
yet. It motivates us to learn for the future and not for any pressing present 
needs. 

No matter how you look at it, you will need algorithms for general 
intelligence. To say otherwise makes zero sense. No algorithms, no design. No 
matter what design you come up with, I call that an algorithm. Algorithms don't 
have to be formulaic or narrow. Keep an open mind about the word 
"algorithm", unless you can suggest a better term to describe general AI 
algorithms.



  **Fluid schemas/concepts/fluid outlines are attempts-to-grasp-things - 
gropings.** 

  Point 2 : I'd relook at your assumptions in all your musings  - my impression 
is they all assume, unwittingly, an *adult* POV - the view of s.o. who already 
knows how to see - as distinct from an infant who is just learning to see and 
get to grips with an extremely blurred world, (even more blurred and 
confusing, I wouldn't be surprised, than that Prakash video). You're 
unwittingly employing top down, fully-formed-intelligence assumptions even 
while overtly trying to produce a learning system - you're looking for what an 
adult wants to know, rather than what an infant 
starting-from-almost-no-knowledge-of-the-world wants to know.

  If you accept the point in any way, major philosophical rethinking

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-09 Thread David Jones
Mike,

Please outline your algorithm for fluid schemas though. It will be clear
when you do that you are faced with the exact same uncertainty problems I am
dealing with and trying to solve. The problems are completely equivalent.
Yours is just a specific approach that is not sufficiently defined.

You have to define how you deal with uncertainty when using fluid schemas or
even how to approach the task of figuring it out. Until then, it's not a
solution to anything.

Dave

On Fri, Jul 9, 2010 at 10:59 AM, Mike Tintner tint...@blueyonder.co.ukwrote:

  If fluid schemas - speaking broadly - are what is needed, (and I'm pretty
 sure they are), it's n.g. trying for something else. You can't substitute a
 square approach for a fluid amoeba outline approach. (And you will
 certainly need exactly such an approach to recognize amoebas).

 If it requires a new kind of machine, or a radically new kind of
 instruction set for computers, then that's what it requires - Stan Franklin,
 BTW, is one person who does recognize this problem, and is trying to deal
 with it - might be worth checking up on him.

 This is partly BTW why my instinct is that it may be better to start with
 tasks for robot hands*, because it should be possible to get them to apply
 a relatively flexible and fluid grip/handshape and grope for and experiment
 with differently shaped objects. And if you accept the broad philosophy I've
 been outlining, then it does make sense that evolution should have started
 with touch as a more primary sense, well before it got to vision.

 *Or perhaps it may prove better to start with robot snakes/bodies or
 somesuch.

  *From:* David Jones davidher...@gmail.com
 *Sent:* Friday, July 09, 2010 3:22 PM
   *To:* agi agi@v2.listbox.com
 *Subject:* Re: [agi] Re: Huge Progress on the Core of AGI



 On Fri, Jul 9, 2010 at 10:04 AM, Mike Tintner tint...@blueyonder.co.ukwrote:

  Couple of quick comments (I'm still thinking about all this  - but I'm
 confident everything AGI links up here).

 A fluid schema is arguably by its v. nature a method - a trial and error,
 arguably universal method. It links vision to the hand or any effector.
 Handling objects also is based on fluid schemas - you put out a fluid
 adjustably-shaped hand to grasp things. And even if you don't have hands,
 like a worm, and must grasp things with your body, and must grasp the
 ground under which you move, then too you must use fluid body schemas/maps.

 All concepts - the basis of language and before language, all intelligence
 - are also almost certainly fluid schemas (and not as you suggested,
 patterns).


 "Fluid schemas" is not an actual algorithm. It is not clear how to go about
 implementing such a design. Even so, when you get into the details of
 actually implementing it, you will find yourself faced with the exact same
 problems I'm trying to solve. So, let's say you take the first frame and
 generate an initial fluid schema. What if an object disappears? What if
 the object changes? What if the object moves a little or a lot? What if a
 large number of changes occur at once, like one new thing suddenly blocking
 a bunch of similar stuff that is behind it? How far does your fluid schema
 have to be distorted for the algorithm to realize that it needs a new schema
 and can't use the same old one? You can't just say that all objects are
 always present and just distort the schema. What if two similar objects
 appear or both move and one disappears? How does your schema handle this?
 Regardless of whether you talk about hypotheses or schemas, it is the SAME
 problem. You can't avoid the fact that the whole thing is underdetermined
 and you need a way to score and compare hypotheses.

 If you disagree, please define your schema algorithm a bit more
 specifically. Then we would be able to analyze its pros and cons better.



 All creative problemsolving begins from concepts of what you want to do
  (and not formulae or algorithms as in rational problemsolving). Any
 suggestion to the contrary will not, I suggest, bear the slightest serious
 examination.


 Sure.  I would point out though that children do stuff just to learn in the
 beginning. A good example is our desire to play. Playing is a strategy by
 which children learn new things even though they don't have a need for those
 things yet. It motivates us to learn for the future and not for any pressing
 present needs.

 No matter how you look at it, you will need algorithms for general
 intelligence. To say otherwise makes zero sense. No algorithms, no design.
 No matter what design you come up with, I call that an algorithm. Algorithms
 don't have to be formulaic or narrow. Keep an open mind about the word
 "algorithm", unless you can suggest a better term to describe general AI
 algorithms.


 **Fluid schemas/concepts/fluid outlines are attempts-to-grasp-things -
 gropings.**

 Point 2 : I'd relook at your assumptions in all your musings  - my
 impression is they all assume, unwittingly

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-09 Thread Mike Tintner
There isn't an algorithm. It's basically a matter of overlaying shapes to see 
if they fit -  much as you put one hand against another to see if they fit - 
much as you can overlay a hand to see if it fits and is capable of grasping an 
object - except considerably more fluid/ rougher. There has to be some 
instruction generating the process, but it's not an algorithm. How can you have 
an algorithm for recognizing amoebas - or rocks or a drop of water? They are 
not patterned entities - or by extension reducible to algorithms. You don't 
need to think too much about internal visual processes - you can just look at 
the external objects-to-be-classified, the objects that make up this world, 
and see this. Just as you can look at a set of diverse patterns and see that 
they too are not reducible to any single formula/pattern/algorithm. We're 
talking about the fundamental structure of the universe and its contents.  If 
this is right and God is an artist before he is a mathematician, then it 
won't do any good screaming about it, you're going to have to invent a way to 
do art, so to speak, on computers. Or you can pretend that dealing with 
mathematical squares will somehow help here - but it hasn't and won't.

Do you think that a creative process like creating 

http://www.apocalyptic-theories.com/gallery/lastjudge/bosch.jpg

started with an algorithm?  There are other ways of solving problems than 
algorithms - the person who created each algorithm in the first place certainly 
didn't have one. 

From: David Jones 
Sent: Friday, July 09, 2010 4:20 PM
To: agi 
Subject: Re: [agi] Re: Huge Progress on the Core of AGI


Mike, 

Please outline your algorithm for fluid schemas though. It will be clear when 
you do that you are faced with the exact same uncertainty problems I am dealing 
with and trying to solve. The problems are completely equivalent. Yours is just 
a specific approach that is not sufficiently defined.

You have to define how you deal with uncertainty when using fluid schemas or 
even how to approach the task of figuring it out. Until then, it's not a 
solution to anything. 

Dave


On Fri, Jul 9, 2010 at 10:59 AM, Mike Tintner tint...@blueyonder.co.uk wrote:

  If fluid schemas - speaking broadly - are what is needed, (and I'm pretty 
sure they are), it's n.g. trying for something else. You can't substitute a 
square approach for a fluid amoeba outline approach. (And you will 
certainly need exactly such an approach to recognize amoebas).

  If it requires a new kind of machine, or a radically new kind of instruction 
set for computers, then that's what it requires - Stan Franklin, BTW, is one 
person who does recognize this problem, and is trying to deal with it - might be 
worth checking up on him.

  This is partly BTW why my instinct is that it may be better to start with 
tasks for robot hands*, because it should be possible to get them to apply a 
relatively flexible and fluid grip/handshape and grope for and experiment with 
differently shaped objects. And if you accept the broad philosophy I've been 
outlining, then it does make sense that evolution should have started with 
touch as a more primary sense, well before it got to vision. 

  *Or perhaps it may prove better to start with robot snakes/bodies or somesuch.


  From: David Jones 
  Sent: Friday, July 09, 2010 3:22 PM
  To: agi 
  Subject: Re: [agi] Re: Huge Progress on the Core of AGI





  On Fri, Jul 9, 2010 at 10:04 AM, Mike Tintner tint...@blueyonder.co.uk 
wrote:

Couple of quick comments (I'm still thinking about all this  - but I'm 
confident everything AGI links up here).

A fluid schema is arguably by its v. nature a method - a trial and error, 
arguably universal method. It links vision to the hand or any effector. 
Handling objects also is based on fluid schemas - you put out a fluid 
adjustably-shaped hand to grasp things. And even if you don't have hands, like 
a worm, and must grasp things with your body, and must grasp the ground under 
which you move, then too you must use fluid body schemas/maps.

All concepts - the basis of language and before language, all intelligence 
- are also almost certainly fluid schemas (and not as you suggested, patterns).

  "Fluid schemas" is not an actual algorithm. It is not clear how to go about 
implementing such a design. Even so, when you get into the details of actually 
implementing it, you will find yourself faced with the exact same problems I'm 
trying to solve. So, let's say you take the first frame and generate an initial 
fluid schema. What if an object disappears? What if the object changes? What 
if the object moves a little or a lot? What if a large number of changes occur 
at once, like one new thing suddenly blocking a bunch of similar stuff that is 
behind it? How far does your fluid schema have to be distorted for the 
algorithm to realize that it needs a new schema and can't use the same old one? 
You can't just say that all objects are always

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-09 Thread David Jones
The way I define algorithms encompasses just about any intelligently
designed system. So, call it what you want. I really wish you would stop
avoiding the word. But, fine. I'll play your word game...

Define your system please. And justify why or how it handles uncertainty.
You said "overlay a hand to see if it fits." How do you define "fits"? The
truth is that it will never fit perfectly, so how do you define a good fit
and a bad one? You will find that you end up with the same exact problems I
am working on. You keep avoiding the need to define the system of fluid
schemas. You're avoiding it because it's not a solution to anything and you
can't define it without realizing that your idea doesn't pan out.
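
For instance, one standard way to quantify "fit" between two shapes, assuming
they come as binary masks already aligned in the same coordinate frame, is
overlap (intersection-over-union); the 0.7 cutoff below is an arbitrary
placeholder, and choosing it is exactly the scoring problem at issue:

import numpy as np

def fit(mask_a, mask_b):
    # 1.0 means identical shapes; 0.0 means no overlap at all.
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / union if union else 0.0

def is_good_fit(mask_a, mask_b, threshold=0.7):
    # A "good fit" vs a "bad fit" still needs a cutoff somewhere.
    return fit(mask_a, mask_b) >= threshold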

So, I dare you. Define your fluid schemas without revealing the fatal flaw
in your reasoning.

Dave
On Fri, Jul 9, 2010 at 12:05 PM, Mike Tintner tint...@blueyonder.co.ukwrote:

  There isn't an algorithm. It's basically a matter of overlaying shapes to
 see if they fit -  much as you put one hand against another to see if they
 fit - much as you can overlay a hand to see if it fits and is capable of
 grasping an object - except considerably more fluid/ rougher. There has to
 be some instruction generating the process, but it's not an algorithm. How
 can you have an algorithm for recognizing amoebas - or rocks or a drop of
 water? They are not patterned entities - or by extension reducible to
 algorithms. You don't need to think too much about internal visual processes
 - you can just look at the external objects-to-be-classified, the objects
 that make up this world, and see this. Just as you can look at a set of
 diverse patterns and see that they too are not reducible to any single
 formula/pattern/algorithm. We're talking about the fundamental structure of
 the universe and its contents.  If this is right and God is an artist
 before he is a mathematician, then it won't do any good screaming about it,
 you're going to have to invent a way to do art, so to speak, on computers.
 Or you can pretend that dealing with mathematical squares will somehow help
 here - but it hasn't and won't.

 Do you think that a creative process like creating

 http://www.apocalyptic-theories.com/gallery/lastjudge/bosch.jpg

 started with an algorithm?  There are other ways of solving problems than
 algorithms - the person who created each algorithm in the first place
 certainly didn't have one.

  *From:* David Jones davidher...@gmail.com
 *Sent:* Friday, July 09, 2010 4:20 PM
   *To:* agi agi@v2.listbox.com
 *Subject:* Re: [agi] Re: Huge Progress on the Core of AGI

 Mike,

 Please outline your algorithm for fluid schemas though. It will be clear
 when you do that you are faced with the exact same uncertainty problems I am
 dealing with and trying to solve. The problems are completely equivalent.
 Yours is just a specific approach that is not sufficiently defined.

 You have to define how you deal with uncertainty when using fluid schemas
 or even how to approach the task of figuring it out. Until then, it's not a
 solution to anything.

 Dave

 On Fri, Jul 9, 2010 at 10:59 AM, Mike Tintner tint...@blueyonder.co.ukwrote:

  If fluid schemas - speaking broadly - are what is needed, (and I'm
 pretty sure they are), it's n.g. trying for something else. You can't
 substitute a square approach for a fluid amoeba outline approach. (And
 you will certainly need exactly such an approach to recognize amoebas).

 If it requires a new kind of machine, or a radically new kind of
 instruction set for computers, then that's what it requires - Stan Franklin,
 BTW, is one person who does recognize this problem, and is trying to deal
 with it - might be worth checking up on him.

 This is partly BTW why my instinct is that it may be better to start with
 tasks for robot hands*, because it should be possible to get them to apply
 a relatively flexible and fluid grip/handshape and grope for and experiment
 with differently shaped objects. And if you accept the broad philosophy I've
 been outlining, then it does make sense that evolution should have started
 with touch as a more primary sense, well before it got to vision.

 *Or perhaps it may prove better to start with robot snakes/bodies or
 somesuch.

  *From:* David Jones davidher...@gmail.com
 *Sent:* Friday, July 09, 2010 3:22 PM
   *To:* agi agi@v2.listbox.com
 *Subject:* Re: [agi] Re: Huge Progress on the Core of AGI



 On Fri, Jul 9, 2010 at 10:04 AM, Mike Tintner 
 tint...@blueyonder.co.ukwrote:

  Couple of quick comments (I'm still thinking about all this  - but I'm
 confident everything AGI links up here).

 A fluid schema is arguably by its v. nature a method - a trial and error,
 arguably universal method. It links vision to the hand or any effector.
 Handling objects also is based on fluid schemas - you put out a fluid
 adjustably-shaped hand to grasp things. And even if you don't have hands,
 like a worm, and must grasp things with your body, and must grasp the
 ground under

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-08 Thread Abram Demski
David,

That's why, imho, the rules need to be *learned* (and, when need be,
unlearned). IE, what we need to work on is general learning algorithms, not
general visual processing algorithms.

As you say, there's not even such a thing as a general visual processing
algorithm. Learning algorithms suffer similar environment-dependence, but
(by their nature) not as severe...

--Abram

On Thu, Jul 8, 2010 at 3:17 PM, David Jones davidher...@gmail.com wrote:

 I've learned something really interesting today. I realized that general
 rules of inference probably don't really exist. There is no such thing as
 complete generality for these problems. The rules of inference that work for
 one environment would fail in alien environments.

 So, I have to modify my approach to solving these problems. As I studied
 over simplified problems, I realized that there are probably an infinite
 number of environments with their own behaviors that are not representative
 of the environments we want to put a general AI in.

 So, it is not ok to just come up with any case study and solve it. The case
 study has to actually be representative of a problem we want to solve in an
 environment where we want to apply AI. Otherwise the solution required will take
 too long to develop because it tries to accommodate too much
 generality. As I mentioned, such a general solution is likely impossible.
 So, someone could easily get stuck trying to solve an impossible task of
 creating one general solution to too many problems that don't allow a
 general solution.

 The best course is a balance between the time required to write a very
 general solution and the time required to write less general solutions for
 multiple problem types and environments. The best way to do this is to
 choose representative case studies to solve and make sure the solutions are
 truth-tropic and justified for the environments they are to be applied.

 Dave


 On Sun, Jun 27, 2010 at 1:31 AM, David Jones davidher...@gmail.comwrote:

 A method for comparing hypotheses in explanatory-based reasoning:

 *We prefer the hypothesis or explanation that expects more
 observations. If both explanations expect the same observations, then the
 simpler of the two is preferred (because the unnecessary terms of the more
 complicated explanation do not add to the predictive power).*

 *Why are expected events so important?* They are a measure of 1)
 explanatory power and 2) predictive power. The more predictive and the more
 explanatory a hypothesis is, the more likely the hypothesis is when compared
 to a competing hypothesis.
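
Read literally, the stated preference is a lexicographic comparison. A sketch
is short; the "expected" counts and "complexity" measure below are
hypothetical stand-ins for whatever quantification is eventually chosen:

def prefer(h1, h2):
    # More expected observations wins; on a tie, the simpler wins.
    if h1["expected"] != h2["expected"]:
        return h1 if h1["expected"] > h2["expected"] else h2
    return h1 if h1["complexity"] <= h2["complexity"] else h2

# Illustrative numbers only: in case study 1 below, the "unison" hypothesis
# expects the same observed positions as the "jumping" hypothesis but
# posits less machinery, so it wins on the tie-breaker.
unison = {"name": "squares move in unison", "expected": 4, "complexity": 1}
jumping = {"name": "square 1 jumps repeatedly", "expected": 4, "complexity": 3}
print(prefer(unison, jumping)["name"])  # squares move in unison

The hard part, as discussed elsewhere in this thread, is producing those two
numbers in the first place.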

 Here are two case studies I've been analyzing from sensory perception of
 simplified visual input:
 The goal of the case studies is to answer the following: How do you
 generate the most likely motion hypothesis in a way that is general and
 applicable to AGI?
 *Case Study 1)* Here is a link to an example: animated gif of two black
 squares moving from left to right:
 http://practicalai.org/images/CaseStudy1.gif
 *Description:* Two black squares are moving in unison from left to right
 across a white screen. In each frame the black squares shift to the right so
 that square 1 steals square 2's original position and square two moves an
 equal distance to the right.
 *Case Study 2)* Here is a link to an example: the interrupted square:
 http://practicalai.org/images/CaseStudy2.gif
 *Description:* A single square is moving from left to right. Suddenly in
 the third frame, a single black square is added in the middle of the
 expected path of the original black square. This second square just stays
 there. So, what happened? Did the square moving from left to right keep
 moving? Or did it stop and then another square suddenly appeared and moved
 from left to right?

 *Here is a simplified version of how we solve case study 1:*
 The important hypotheses to consider are:
 1) the square from frame 1 of the video that has a very close position to
 the square from frame 2 should be matched (we hypothesize that they are the
 same square and that any difference in position is motion).  So, what
 happens is that in each two frames of the video, we only match one square.
 The other square goes unmatched.
 2) We do the same thing as in hypothesis #1, but this time we also match
 the remaining squares and hypothesize motion as follows: the first square
 jumps over the second square from left to right. We hypothesize that this
 happens over and over in each frame of the video. Square 2 stops and square
 1 jumps over it over and over again.
 3) We hypothesize that both squares move to the right in unison. This is
 the correct hypothesis.

 So, why should we prefer the correct hypothesis, #3 over the other two?

 Well, first of all, #3 is correct because it has the most explanatory
 power of the three and is the simplest of the three. Simpler is better
 because, with the given evidence and information, there is no reason to
 desire a more complicated hypothesis such 

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-08 Thread David Jones
It may not be possible to create a learning algorithm that can learn how to
generally process images and other general AGI problems. This is for the
same reason that completely general vision algorithms are likely impossible.
I think that figuring out how to process sensory information intelligently
requires either 1) impossible amounts of processing or 2) intelligent design
and understanding by us.

Maybe you could be more specific about how general learning algorithms would
solve problems such as the one I'm tackling. But, I am extremely doubtful it
can be done because the problems cannot be effectively described to such an
algorithm. If you can't describe the problem, it can't search for solutions.
If it can't search for solutions, you're basically stuck with evolution type
algorithms, which require prohibitive amounts of processing.

The reason that vision is so important for learning is that sensory
perception is the foundation required to learn everything else. If you don't
start with a foundational problem like this, you won't be representing the
real nature of general intelligence problems that require extensive
knowledge of the world to solve properly. Sensory perception is required to
learn the information needed to understand everything else. Text and
language for example, require extensive knowledge about the world to
understand and especially to learn about. If you start with general learning
algorithms on these unrepresentative problems, you will get stuck as we
already have.

So, it still makes a lot of sense to start with a concrete problem that does
not require extensive amounts of previous knowledge to start learning. In
fact, AGI requires that you not pre-program the AI with such extensive
knowledge. So, lots of people are working on general learning algorithms
that are unrepresentative of what is required for AGI because the algorithms
don't have the knowledge needed to learn what they are trying to learn
about. Regardless of how you look at it, my approach is definitely the right
approach to AGI in my opinion.



On Thu, Jul 8, 2010 at 5:02 PM, Abram Demski abramdem...@gmail.com wrote:

 David,

 That's why, imho, the rules need to be *learned* (and, when need be,
 unlearned). IE, what we need to work on is general learning algorithms, not
 general visual processing algorithms.

 As you say, there's not even such a thing as a general visual processing
 algorithm. Learning algorithms suffer similar environment-dependence, but
 (by their nature) not as severe...

 --Abram

 On Thu, Jul 8, 2010 at 3:17 PM, David Jones davidher...@gmail.com wrote:

 I've learned something really interesting today. I realized that general
 rules of inference probably don't really exist. There is no such thing as
 complete generality for these problems. The rules of inference that work for
 one environment would fail in alien environments.

 So, I have to modify my approach to solving these problems. As I studied
 over simplified problems, I realized that there are probably an infinite
 number of environments with their own behaviors that are not representative
 of the environments we want to put a general AI in.

 So, it is not ok to just come up with any case study and solve it. The
 case study has to actually be representative of a problem we want to solve
 in an environment where we want to apply AI. Otherwise the solution required
 will take too long to develop because it tries to accommodate too much
 generality. As I mentioned, such a general solution is likely impossible.
 So, someone could easily get stuck trying to solve an impossible task of
 creating one general solution to too many problems that don't allow a
 general solution.

 The best course is a balance between the time required to write a very
 general solution and the time required to write less general solutions for
 multiple problem types and environments. The best way to do this is to
 choose representative case studies to solve and make sure the solutions are
 truth-tropic and justified for the environments they are to be applied.

 Dave


 On Sun, Jun 27, 2010 at 1:31 AM, David Jones davidher...@gmail.comwrote:

 A method for comparing hypotheses in explanatory-based reasoning:

 *We prefer the hypothesis or explanation that expects more
 observations. If both explanations expect the same observations, then the
 simpler of the two is preferred (because the unnecessary terms of the more
 complicated explanation do not add to the predictive power).*

 *Why are expected events so important?* They are a measure of 1)
 explanatory power and 2) predictive power. The more predictive and the more
 explanatory a hypothesis is, the more likely the hypothesis is when compared
 to a competing hypothesis.

 Here are two case studies I've been analyzing from sensory perception of
 simplified visual input:
 The goal of the case studies is to answer the following: How do you
 generate the most likely motion hypothesis in a way that is general 

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-08 Thread Abram Demski
David,

How I'd present the problem would be "predict the next frame", or more
generally "predict a specified portion of video given a different portion". Do
you object to this approach?

--Abram

On Thu, Jul 8, 2010 at 5:30 PM, David Jones davidher...@gmail.com wrote:

 It may not be possible to create a learning algorithm that can learn how to
 generally process images and other general AGI problems. This is for the
 same reason that completely general vision algorithms are likely impossible.
 I think that figuring out how to process sensory information intelligently
 requires either 1) impossible amounts of processing or 2) intelligent design
 and understanding by us.

 Maybe you could be more specific about how general learning algorithms
 would solve problems such as the one I'm tackling. But, I am extremely
 doubtful it can be done because the problems cannot be effectively described
 to such an algorithm. If you can't describe the problem, it can't search for
 solutions. If it can't search for solutions, you're basically stuck with
 evolution type algorithms, which require prohibitive amounts of processing.

 The reason that vision is so important for learning is that sensory
 perception is the foundation required to learn everything else. If you don't
 start with a foundational problem like this, you won't be representing the
 real nature of general intelligence problems that require extensive
 knowledge of the world to solve properly. Sensory perception is required to
 learn the information needed to understand everything else. Text and
 language for example, require extensive knowledge about the world to
 understand and especially to learn about. If you start with general learning
 algorithms on these unrepresentative problems, you will get stuck as we
 already have.

 So, it still makes a lot of sense to start with a concrete problem that
 does not require extensive amounts of previous knowledge to start learning.
 In fact, AGI requires that you not pre-program the AI with such extensive
 knowledge. So, lots of people are working on general learning algorithms
 that are unrepresentative of what is required for AGI because the algorithms
 don't have the knowledge needed to learn what they are trying to learn
 about. Regardless of how you look at it, my approach is definitely the right
 approach to AGI in my opinion.



 On Thu, Jul 8, 2010 at 5:02 PM, Abram Demski abramdem...@gmail.comwrote:

 David,

 That's why, imho, the rules need to be *learned* (and, when need be,
 unlearned). IE, what we need to work on is general learning algorithms, not
 general visual processing algorithms.

 As you say, there's not even such a thing as a general visual processing
 algorithm. Learning algorithms suffer similar environment-dependence, but
 (by their nature) not as severe...

 --Abram

 On Thu, Jul 8, 2010 at 3:17 PM, David Jones davidher...@gmail.comwrote:

 I've learned something really interesting today. I realized that general
 rules of inference probably don't really exist. There is no such thing as
 complete generality for these problems. The rules of inference that work for
 one environment would fail in alien environments.

 So, I have to modify my approach to solving these problems. As I studied
 over simplified problems, I realized that there are probably an infinite
 number of environments with their own behaviors that are not representative
 of the environments we want to put a general AI in.

 So, it is not ok to just come up with any case study and solve it. The
 case study has to actually be representative of a problem we want to solve
 in an environment where we want to apply AI. Otherwise the solution required
 will take too long to develop because it tries to accommodate too much
 generality. As I mentioned, such a general solution is likely impossible.
 So, someone could easily get stuck trying to solve an impossible task of
 creating one general solution to too many problems that don't allow a
 general solution.

 The best course is a balance between the time required to write a very
 general solution and the time required to write less general solutions for
 multiple problem types and environments. The best way to do this is to
 choose representative case studies to solve and make sure the solutions are
 truth-tropic and justified for the environments they are to be applied.

 Dave


 On Sun, Jun 27, 2010 at 1:31 AM, David Jones davidher...@gmail.comwrote:

 A method for comparing hypotheses in explanatory-based reasoning:

 *We prefer the hypothesis or explanation that expects more
 observations. If both explanations expect the same observations, then the
 simpler of the two is preferred (because the unnecessary terms of the more
 complicated explanation do not add to the predictive power).*

 *Why are expected events so important?* They are a measure of 1)
 explanatory power and 2) predictive power. The more predictive and the more
 explanatory a hypothesis is, the more likely the 

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-08 Thread David Jones
Abram,

Yeah, I would have to object for a couple reasons.

First, prediction requires previous knowledge. So, even if you make that
your primary goal, you're still going to have my research goals as the
prerequisite: which are to process visual information in a more general way
and learn about the environment in a more general way.

Second, not everything is predictable. Certainly, we should not try to
predict everything. Only after we have experience, can we actually predict
anything. Even then, it's not precise prediction, like predicting the next
frame of a video. It's more like having knowledge of what is quite likely to
occur, or maybe an approximate prediction, but not guaranteed in the least.
For example, based on previous experience, striking a match will light it.
But, sometimes it doesn't light, and that too is expected to occur
sometimes. We definitely don't predict the next image we'll see when it
lights though. We just have expectations for what we might see and this
helps us interpret the image effectively. We should try to expect certain
outcomes or possible outcomes though. You could call that prediction, but
it's not quite the same. The things we are more likely to see should be
attempted as an explanation first and preferred if not given a reason to
think otherwise.


Dave


On Thu, Jul 8, 2010 at 5:51 PM, Abram Demski abramdem...@gmail.com wrote:

 David,

 How I'd present the problem would be "predict the next frame", or more
 generally "predict a specified portion of video given a different portion". Do
 you object to this approach?

 --Abram

 On Thu, Jul 8, 2010 at 5:30 PM, David Jones davidher...@gmail.com wrote:

 It may not be possible to create a learning algorithm that can learn how
 to generally process images and other general AGI problems. This is for the
 same reason that completely general vision algorithms are likely impossible.
 I think that figuring out how to process sensory information intelligently
 requires either 1) impossible amounts of processing or 2) intelligent design
 and understanding by us.

 Maybe you could be more specific about how general learning algorithms
 would solve problems such as the one I'm tackling. But, I am extremely
 doubtful it can be done because the problems cannot be effectively described
 to such an algorithm. If you can't describe the problem, it can't search for
 solutions. If it can't search for solutions, you're basically stuck with
 evolution type algorithms, which require prohibitive amounts of processing.

 The reason that vision is so important for learning is that sensory
 perception is the foundation required to learn everything else. If you don't
 start with a foundational problem like this, you won't be representing the
 real nature of general intelligence problems that require extensive
 knowledge of the world to solve properly. Sensory perception is required to
 learn the information needed to understand everything else. Text and
 language for example, require extensive knowledge about the world to
 understand and especially to learn about. If you start with general learning
 algorithms on these unrepresentative problems, you will get stuck as we
 already have.

 So, it still makes a lot of sense to start with a concrete problem that
 does not require extensive amounts of previous knowledge to start learning.
 In fact, AGI requires that you not pre-program the AI with such extensive
 knowledge. So, lots of people are working on general learning algorithms
 that are unrepresentative of what is required for AGI because the algorithms
 don't have the knowledge needed to learn what they are trying to learn
 about. Regardless of how you look at it, my approach is definitely the right
 approach to AGI in my opinion.



 On Thu, Jul 8, 2010 at 5:02 PM, Abram Demski abramdem...@gmail.comwrote:

 David,

 That's why, imho, the rules need to be *learned* (and, when need be,
 unlearned). IE, what we need to work on is general learning algorithms, not
 general visual processing algorithms.

 As you say, there's not even such a thing as a general visual processing
 algorithm. Learning algorithms suffer similar environment-dependence, but
 (by their nature) not as severe...

 --Abram

 On Thu, Jul 8, 2010 at 3:17 PM, David Jones davidher...@gmail.comwrote:

 I've learned something really interesting today. I realized that general
 rules of inference probably don't really exist. There is no such thing as
 complete generality for these problems. The rules of inference that work 
 for
 one environment would fail in alien environments.

 So, I have to modify my approach to solving these problems. As I studied
 over simplified problems, I realized that there are probably an infinite
 number of environments with their own behaviors that are not representative
 of the environments we want to put a general AI in.

 So, it is not ok to just come up with any case study and solve it. The
 case study has to actually be representative of a problem we 

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-08 Thread Mike Tintner
Isn't the first problem simply to differentiate the objects in a scene?  (Maybe 
the most important movement to begin with is not  the movement of the object, 
but of the viewer changing their POV if only slightly  - wh. won't be a factor 
if you're looking at a screen)

And that I presume comes down to being able to put a crude, highly tentative, 
and fluid outline round them (something that won't be neces. if you're dealing 
with squares?). While knowing v. little if anything about what kind of 
objects they are. As an infant most likely does. (See infants' drawings and how 
they evolve v. gradually from a v. crude outline blob that at first can 
represent anything - that I'm suggesting is a replay of how visual perception 
developed.)

The fluid outline or image schema is arguably the basis of all intelligence - 
just about everything AGI is based on it.  You need an outline for instance not 
just of objects, but of where you're going, and what you're going to try and do 
- if you want to survive in the real world.  Schemas connect everything AGI.

And it's not a matter of choice - first you have to have an outline/sense of 
the whole - whatever it is -  before you can start filling in the parts.

P.S. It would be mindblowingly foolish BTW to think you can do better than the 
way an infant learns to see - that's an awfully big visual section of the brain 
there, and it works.


David,

How I'd present the problem would be "predict the next frame", or more 
generally "predict a specified portion of video given a different portion". Do 
you object to this approach?

--Abram


On Thu, Jul 8, 2010 at 5:30 PM, David Jones davidher...@gmail.com wrote:

  It may not be possible to create a learning algorithm that can learn how to 
generally process images and other general AGI problems. This is for the same 
reason that completely general vision algorithms are likely impossible. I think 
that figuring out how to process sensory information intelligently requires 
either 1) impossible amounts of processing or 2) intelligent design and 
understanding by us. 

  Maybe you could be more specific about how general learning algorithms would 
solve problems such as the one I'm tackling. But, I am extremely doubtful it 
can be done because the problems cannot be effectively described to such an 
algorithm. If you can't describe the problem, it can't search for solutions. If 
it can't search for solutions, you're basically stuck with evolution type 
algorithms, which require prohibitive amounts of processing.

  The reason that vision is so important for learning is that sensory 
perception is the foundation required to learn everything else. If you don't 
start with a foundational problem like this, you won't be representing the real 
nature of general intelligence problems that require extensive knowledge of the 
world to solve properly. Sensory perception is required to learn the 
information needed to understand everything else. Text and language for 
example, require extensive knowledge about the world to understand and 
especially to learn about. If you start with general learning algorithms on 
these unrepresentative problems, you will get stuck as we already have.

  So, it still makes a lot of sense to start with a concrete problem that does 
not require extensive amounts of previous knowledge to start learning. In fact, 
AGI requires that you not pre-program the AI with such extensive knowledge. So, 
lots of people are working on general learning algorithms that are 
unrepresentative of what is required for AGI because the algorithms don't have 
the knowledge needed to learn what they are trying to learn about. Regardless 
of how you look at it, my approach is definitely the right approach to AGI in 
my opinion.




  On Thu, Jul 8, 2010 at 5:02 PM, Abram Demski abramdem...@gmail.com wrote:

David,

That's why, imho, the rules need to be *learned* (and, when need be, 
unlearned). IE, what we need to work on is general learning algorithms, not 
general visual processing algorithms.

As you say, there's not even such a thing as a general visual processing 
algorithm. Learning algorithms suffer similar environment-dependence, but (by 
their nature) not as severe...

--Abram


On Thu, Jul 8, 2010 at 3:17 PM, David Jones davidher...@gmail.com wrote:

  I've learned something really interesting today. I realized that general 
rules of inference probably don't really exist. There is no such thing as 
complete generality for these problems. The rules of inference that work for 
one environment would fail in alien environments. 

  So, I have to modify my approach to solving these problems. As I studied 
over simplified problems, I realized that there are probably an infinite number 
of environments with their own behaviors that are not representative of the 
environments we want to put a general AI in. 

  So, it is not ok to just come up with any case study and solve it. The 
case 

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-04 Thread Jim Bromer
I figured out a way to make the Solomonoff Induction iteratively infinite,
so I guess I was wrong.  Thanks for explaining it to me.  However, I don't
accept that it is feasible to make those calculations since an examination
of the infinite programs that could output each individual string would be
required.

My sense is that the statistics of an examination of a finite number of
programs that output a finite number of strings could be used in Solomonoff
Induction to give a reliable probability of what the next bit (or next
sequence of bits) might be based on the sampling, under the condition that
only those cases that had previously occurred would occur again and at the
same frequency during the samplings.  However, the attempt to figure the
probabilities of concatenation of these strings or sub strings would be
unreliable and void whatever benefit the theoretical model might appear to
offer.
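
A toy version of that finite sampling, assuming a hand-picked hypothesis class
in place of the (uncomputable) enumeration of all programs, and with made-up
bit costs for each "program":

def predict_next(bits, hypotheses):
    # Keep only programs consistent with the observed prefix, weight each
    # by 2^-length, and mix their predictions for the next bit.
    weights = {0: 0.0, 1: 0.0}
    for length, prog in hypotheses:
        if all(prog(i) == b for i, b in enumerate(bits)):
            weights[prog(len(bits))] += 2.0 ** -length
    total = sum(weights.values())
    return {b: w / total for b, w in weights.items()} if total else None

hyps = [(8, lambda i: 0), (8, lambda i: 1), (10, lambda i: i % 2)]
print(predict_next([0, 1, 0, 1], hyps))  # {0: 1.0, 1: 0.0}

As the paragraph above says, the estimate is only as good as the sample: a
string none of the sampled programs can produce gets no probability at all.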

Logic, probability and compression methods are all useful in AGI even though
we are constantly violating the laws of logic and probability because it is
necessary, and we sometimes need to use more complicated models
(anti-compression so to speak) so that we can consider other possibilities
based on what we have previously learned.  So, I still don't see how
Kolmogorov Complexity and Solomonoff Induction are truly useful except as
theoretical methods that are interesting to consider.

And, Occam's Razor is not reliable as an axiom of science.  If we were to
abide by it we would come to conclusions like a finding that describes an
event by saying that it occurs some of the time, since it would be simpler
than trying to describe the greater circumstances of the event in an effort
to see if we can find something out about why the event occurred or didn't
occur.  In this sense Occam's Razor is anti-science since it implies that
the status quo should be maintained because simpler is better.  All things
being equal, simpler is better.  I think we all get that.  However, the
human mind is capable of re weighting the conditions and circumstances of a
system to reconsider other possibilities and that seems to be an important
and necessary method in research (and in planning).

Jim Bromer

On Sat, Jul 3, 2010 at 11:39 AM, Matt Mahoney matmaho...@yahoo.com wrote:

   Jim Bromer wrote:
  You can't assume a priori that the diagonal argument is not relevant.

 When I say infinite in my proof of Solomonoff induction, I mean countably
 infinite, as in aleph-null, as in there is a 1 to 1 mapping between the set
 and N, the set of natural numbers. There are a countably infinite number of
 finite strings, or of finite programs, or of finite length descriptions of
 any particular string. For any finite length string or program or
 description x with nonzero probability, there are a countably infinite
 number of finite length strings or programs or descriptions that are longer
 and less likely than x, and a finite number of finite length strings or
 programs or descriptions that are either shorter or more likely or both than
 x.
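
(One standard witness for that countability: identify each finite binary
string x with the natural number whose binary expansion is 1x, so the empty
string maps to 1, "0" to 2, "1" to 3, "00" to 4, and so on; this is a
bijection from the set of all finite binary strings onto {1, 2, 3, ...}.)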

 Aleph-null is larger than any finite integer. This means that for any
 finite set and any countably infinite set, there is not a 1 to 1 mapping
 between the elements, and if you do map all of the elements of the finite
 set to elements of the infinite set, then there are unmapped elements of the
 infinite set left over.

 Cantor's diagonalization argument proves that there are infinities larger
 than aleph-null, such as the cardinality of the set of real numbers, which
 we call uncountably infinite. But since I am not using any uncountably
 infinite sets, I don't understand your objection.


 -- Matt Mahoney, matmaho...@yahoo.com


  --
 *From:* Jim Bromer jimbro...@gmail.com
 *To:* agi agi@v2.listbox.com
 *Sent:* Sat, July 3, 2010 9:43:15 AM

 *Subject:* Re: [agi] Re: Huge Progress on the Core of AGI

 On Fri, Jul 2, 2010 at 6:08 PM, Matt Mahoney matmaho...@yahoo.com wrote:

   Jim, to address all of your points,

 Solomonoff induction claims that the probability of a string is
 proportional to the number of programs that output the string, where each
 program M is weighted by 2^-|M|. The probability is dominated by the
 shortest program (Kolmogorov complexity), but it is not exactly the same.
 The difference is small enough that we may neglect it, just as we neglect
 differences that depend on choice of language.



 The infinite number of programs that could output the infinite number of
 strings that are to be considered (for example while using Solomonoff
 induction to predict what string is being output) lays out the potential
 for the diagonal argument.  You can't assume a priori that the diagonal
 argument is not relevant.  I don't believe that you can prove that it isn't
 relevant since as you say, Kolmogorov Complexity is not computable, and you
 cannot be sure that you have listed all the programs that were able to
 output a particular string. This creates a situation

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-03 Thread Jim Bromer
On Fri, Jul 2, 2010 at 6:08 PM, Matt Mahoney matmaho...@yahoo.com wrote:

   Jim, to address all of your points,

 Solomonoff induction claims that the probability of a string is
 proportional to the number of programs that output the string, where each
 program M is weighted by 2^-|M|. The probability is dominated by the
 shortest program (Kolmogorov complexity), but it is not exactly the same.
 The difference is small enough that we may neglect it, just as we neglect
 differences that depend on choice of language.



The infinite number of programs that could output the infinite number of
strings that are to be considered (for example while using Solomonoff
induction to predict what string is being output) lays out the potential
for the diagonal argument.  You can't assume a priori that the diagonal
argument is not relevant.  I don't believe that you can prove that it isn't
relevant since as you say, Kolmogorov Complexity is not computable, and you
cannot be sure that you have listed all the programs that were able to
output a particular string. This creates a situation in which the underlying
logic of using Solomonoff induction is based on incomputable reasoning, which
can be shown using the diagonal argument.

This kind of criticism cannot be answered with the kinds of presumptions
that you used to derive the conclusions that you did.  It has to be answered
directly.  I can think of other infinite-to-infinite relations in which the
potential mappings can be countably derived from the formulas or equations,
but I have yet to see any analysis which explains why this usage can be so
derived.  Although you may imagine that the summation of the probabilities can
be used just as if it were an ordinary number, the unchecked usage is faulty.
In other words, the criticism has to be considered more carefully by someone
capable of dealing with complex mathematical problems that involve the
legitimacy of claims between infinite-to-infinite mappings.

Jim Bromer



On Fri, Jul 2, 2010 at 6:08 PM, Matt Mahoney matmaho...@yahoo.com wrote:

   Jim, to address all of your points,

 Solomonoff induction claims that the probability of a string is
 proportional to the number of programs that output the string, where each
 program M is weighted by 2^-|M|. The probability is dominated by the
 shortest program (Kolmogorov complexity), but it is not exactly the same.
 The difference is small enough that we may neglect it, just as we neglect
 differences that depend on choice of language.

 Here is the proof that Kolmogorov complexity is not computable. Suppose it
 were. Then I could test the Kolmogorov complexity of strings in increasing
 order of length (breaking ties lexicographically) and describe the first
 string that cannot be described in less than a million bits, contradicting
 the fact that I just did. (Formally, I could write a program that outputs
 the first string whose Kolmogorov complexity is at least n bits, choosing n
 to be larger than my program).

 Here is the argument that Occam's Razor and Solomonoff distribution must be
 true. Consider all possible probability distributions p(x) over any infinite
 set X of possible finite strings x, i.e. any X = {x: p(x) > 0} that is
 infinite. All such distributions must favor shorter strings over longer
 ones. Consider any x in X. Then p(x) > 0. There can be at most a finite
 number (less than 1/p(x)) of strings that are more likely than x, and
 therefore an infinite number of strings which are less likely than x. Of
 this infinite set, only a finite number (less than 2^|x|) can be shorter
 than x, and therefore there must be an infinite number that are longer than
 x. So for each x we can partition X into 4 subsets as follows:

 - shorter and more likely than x: finite
 - shorter and less likely than x: finite
 - longer and more likely than x: finite
 - longer and less likely than x: infinite.

 So in this sense, any distribution over the set of strings must favor
 shorter strings over longer ones.


 -- Matt Mahoney, matmaho...@yahoo.com


  --
 From: Jim Bromer jimbro...@gmail.com
 To: agi agi@v2.listbox.com
 Sent: Fri, July 2, 2010 4:09:38 PM

 Subject: Re: [agi] Re: Huge Progress on the Core of AGI



 On Fri, Jul 2, 2010 at 2:25 PM, Jim Bromer jimbro...@gmail.com wrote:

There cannot be a one to one correspondence between the representation of
 the shortest program that produces a string and the strings that are produced.
 This means that if the consideration of the hypotheses were to be put into
 general mathematical form, it must include the potential for many to one
 relations between candidate programs (or subprograms) and output strings.



 But, there is also no way to determine what the shortest program is,
 since there may be different programs that are the same length.  That means
 that there is a many to one relation between programs and program length.
 So the claim that you could just iterate through programs *by length* is
 false

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-03 Thread Jim Bromer
This group, as in most AGI discussions, will use logic and statistical
theory loosely.  We have to.  One reason is that we - thinking entities - do not
know everything and so our reasoning is based on fragmentary knowledge.  In
this situation the boundaries of logical reasoning in thought, both natural
and artificial, are going to be transgressed.  However, knowing that is
going to be the case in AGI, we can acknowledge it and try to figure out
algorithms that will tend to ground our would-be programs.

Now Solomonoff Induction and Algorithmic Information Theory are a little
different.  They deal with concrete data spaces.  We can and should question
how relevant those concrete sample spaces might be to general reasoning
about the greater universe of knowledge, but the fact that they deal with
concrete spaces means that they might be logically bound.  But are they?  If
an idealism is both concrete (too concrete for our uses) and not logically
computable, then we really have to be wary of trying to use it.

If using Solomonoff Induction is incomputable it does not prove that it is
illogical.  But if it is incomputable, it would be illogical to believe that
it can be used reliably.

Solomonoff Induction has been around long enough for serious mathematicians
to examine its validity.  If it were a genuinely sound method, mathematicians
would have accepted it.  However, if Solomonoff Induction is incomputable in
practice it would be so unreliable that top mathematicians would tend to
choose more productive and interesting subjects to study.  As far as I can
tell, Solomonoff Induction exists today within the backwash of AI
communities.  It has found new life in these kinds of discussion groups
where most of us do not have the skill or the time to critically examine the
basis of every theory that is put forward.  The one test that we can make is
whether or not some method that is being presented has some reliability in
our programs, which constitute mini-experiments.  Logic and probability pass
the smell test, even though we know that our use of them in AGI is not
ideal.

Jim Bromer





Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-03 Thread Matt Mahoney
Jim Bromer wrote:
 You can't assume a priori that the diagonal argument is not relevant. 

When I say infinite in my proof of Solomonoff induction, I mean countably 
infinite, as in aleph-null, as in there is a 1 to 1 mapping between the set and 
N, the set of natural numbers. There are a countably infinite number of finite 
strings, or of finite programs, or of finite length descriptions of any 
particular string. For any finite length string or program or description x 
with nonzero probability, there are a countably infinite number of finite 
length strings or programs or descriptions that are longer and less likely than 
x, and a finite number of finite length strings or programs or descriptions 
that are either shorter or more likely or both than x.
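
To see that 1 to 1 mapping concretely, here is a minimal C sketch (an
illustrative aside) of the standard shortest-first enumeration of the finite
binary strings: write n+1 in binary and drop the leading 1, and you get the
n-th string, so every finite string receives exactly one natural-number index.

    #include <stdio.h>

    /* Print the n-th binary string in shortest-first (length-lex) order:
       n=0 -> "" (empty), 1 -> "0", 2 -> "1", 3 -> "00", 4 -> "01", ... */
    void print_nth_string(unsigned n) {
        unsigned m = n + 1;
        int bits[32], k = 0;
        while (m > 1) { bits[k++] = m & 1; m >>= 1; }  /* drop leading 1 */
        while (k > 0) putchar('0' + bits[--k]);
        putchar('\n');
    }

    int main(void) {
        for (unsigned n = 0; n <= 10; ++n) print_nth_string(n);
        return 0;
    }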

Aleph-null is larger than any finite integer. This means that for any finite 
set and any countably infinite set, there is not a 1 to 1 mapping between the 
elements, and if you do map all of the elements of the finite set to elements 
of the infinite set, then there are unmapped elements of the infinite set left 
over.

Cantor's diagonalization argument proves that there are infinities larger than 
aleph-null, such as the cardinality of the set of real numbers, which we call 
uncountably infinite. But since I am not using any uncountably infinite sets, I 
don't understand your objection.

 -- Matt Mahoney, matmaho...@yahoo.com





From: Jim Bromer jimbro...@gmail.com
To: agi agi@v2.listbox.com
Sent: Sat, July 3, 2010 9:43:15 AM
Subject: Re: [agi] Re: Huge Progress on the Core of AGI

On Fri, Jul 2, 2010 at 6:08 PM, Matt Mahoney matmaho...@yahoo.com wrote:

Jim, to address all of your points,


Solomonoff induction claims that the probability of a string is proportional 
to the number of programs that output the string, where each program M is 
weighted by 2^-|M|. The probability is dominated by the shortest program 
(Kolmogorov complexity), but it is not exactly the same. The difference is 
small enough that we may neglect it, just as we neglect differences that 
depend on choice of language.
 
 
The infinite number of programs that could output the infinite number of 
strings that are to be considered (for example while using Solomonoff induction 
to predict what string is being output) lays out the potential for the 
diagonal argument.  You can't assume a priori that the diagonal argument is not 
relevant.  I don't believe that you can prove that it isn't relevant since as 
you say, Kolmogorov Complexity is not computable, and you cannot be sure that 
you have listed all the programs that were able to output a particular string. 
This creates a situation in which the underlying logic of using Solomonoff 
induction is based on incomputable reasoning, which can be shown using the 
diagonal argument.
 
This kind of criticism cannot be answered with the kinds of presumptions that 
you used to derive the conclusions that you did.  It has to be answered 
directly.  I can think of other infinite-to-infinite relations in which the 
potential mappings can be countably derived from the formulas or equations, but 
I have yet to see any analysis which explains why this usage can be so derived.  
Although you may imagine that the summation of the probabilities can be used 
just as if it were an ordinary number, the unchecked usage is faulty.  In other 
words, the criticism has to be considered more carefully by someone capable of 
dealing with complex mathematical problems that involve the legitimacy of 
claims between infinite-to-infinite mappings.
 
Jim Bromer
 
 
 
On Fri, Jul 2, 2010 at 6:08 PM, Matt Mahoney matmaho...@yahoo.com wrote:

Jim, to address all of your points,


Solomonoff induction claims that the probability of a string is proportional 
to the number of programs that output the string, where each program M is 
weighted by 2^-|M|. The probability is dominated by the shortest program 
(Kolmogorov complexity), but it is not exactly the same. The difference is 
small enough that we may neglect it, just as we neglect differences that 
depend on choice of language.


Here is the proof that Kolmogorov complexity is not computable. Suppose it 
were. Then I could test the Kolmogorov complexity of strings in increasing 
order of length (breaking ties lexicographically) and describe the first 
string that cannot be described in less than a million bits, contradicting 
the fact that I just did. (Formally, I could write a program that outputs the 
first string whose Kolmogorov complexity is at least n bits, choosing n to be 
larger than my program).


Here is the argument that Occam's Razor and Solomonoff distribution must be 
true. Consider all possible probability distributions p(x) over any infinite 
set X of possible finite strings x, i.e. any X = {x: p(x) > 0} that is 
infinite. All such distributions must favor shorter strings over longer ones. 
Consider any x in X. Then p(x) > 0. There can be at most a finite number (less 
than 1/p(x

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-02 Thread Jim Bromer
On Wed, Jun 30, 2010 at 5:13 PM, Matt Mahoney matmaho...@yahoo.com wrote:

   Jim, what evidence do you have that Occam's Razor ... is wrong, besides
 your own opinions? It is well established that elegant (short) theories are
 preferred in all branches of science because they have greater predictive
 power.



  -- Matt Mahoney, matmaho...@yahoo.com


When a heuristic is used as if it were an axiom of truth, it will interfere
with the development of reasonable insight precisely because a heuristic is not
an axiom.  And in applying this heuristic (which does have value) as an
unquestionable axiom of mind, you are making a more egregious claim, because
you are multiplying the force of the error.

Occam's razor has greater predictive power within the boundaries of the
isolation experiments which have the greatest potential to enhance its
power.  If the simplest theories are preferred because they have the greater
predictive power, then it would follow that isolation experiments would be
the preferred vehicles of science just because they can produce theories
that had the most predictive power.  Whether this is the case or not (the
popular opinion), it does not answer the question of whether narrow AI (for
example) should be the preferred child of computer science just because the
theorems of narrow AI are so much better at predicting their (narrow) events
than the theorems of AGI are at comprehending their (more
complicated) events.

Jim Bromer





Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-02 Thread Jim Bromer
On Wed, Jun 30, 2010 at 5:13 PM, Matt Mahoney matmaho...@yahoo.com wrote:

   Jim, what evidence do you have that Occam's Razor or algorithmic
 information theory is wrong,
 Also, what does this have to do with Cantor's diagonalization argument? AIT
 considers only the countably infinite set of hypotheses.


 -- Matt Mahoney, matmaho...@yahoo.com



There cannot be a one to one correspondence between the representation of the
shortest program that produces a string and the strings that are produced.
This means that if the consideration of the hypotheses were to be put into
general mathematical form, it must include the potential for many to one
relations between candidate programs (or subprograms) and output strings.





Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-02 Thread Jim Bromer
On Fri, Jul 2, 2010 at 2:09 PM, Jim Bromer jimbro...@gmail.com wrote:

  On Wed, Jun 30, 2010 at 5:13 PM, Matt Mahoney matmaho...@yahoo.com wrote:

   Jim, what evidence do you have that Occam's Razor or algorithmic
 information theory is wrong,
 Also, what does this have to do with Cantor's diagonalization argument?
 AIT considers only the countably infinite set of hypotheses.
  -- Matt Mahoney, matmaho...@yahoo.com



  There cannot be a one to one correspondence between the representation of the
 shortest program that produces a string and the strings that are produced.
 This means that if the consideration of the hypotheses were to be put into
 general mathematical form, it must include the potential for many to one
 relations between candidate programs (or subprograms) and output strings.


But, there is also no way to determine what the shortest program is, since
there may be different programs that are the same length.  That means that
there is a many to one relation between programs and program length.  So
the claim that you could just iterate through programs *by length* is
false.  This is the goal of algorithmic information theory, not a premise
of a methodology that can be used.  So you have the diagonalization problem.





Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-02 Thread Jim Bromer
On Fri, Jul 2, 2010 at 2:25 PM, Jim Bromer jimbro...@gmail.com wrote:

There cannot be a one to one correspondence between the representation of
 the shortest program that produces a string and the strings that are produced.
 This means that if the consideration of the hypotheses were to be put into
 general mathematical form, it must include the potential for many to one
 relations between candidate programs (or subprograms) and output strings.



 But, there is also no way to determine what the shortest program is,
 since there may be different programs that are the same length.  That means
 that there is a many to one relation between programs and program length.
 So the claim that you could just iterate through programs *by length* is
 false.  This is the goal of algorithmic information theory, not a premise
 of a methodology that can be used.  So you have the diagonalization problem.



A counter argument is that there are only a finite number of Turing Machine
programs of a given length.  However, since you guys have specifically
designated that this theorem applies to any construction of a Turing Machine,
it is not clear that this counter argument can be used.  And there is still
the specific problem that you might want to try a program that writes a
longer program to output a string (or many strings).  Or you might want to
write a program that can be called to write longer programs on a dynamic
basis.  I think these cases, where you might consider a program that outputs
a longer program (or another instruction string for another Turing
Machine), constitute a serious problem that, at the least, deserves to be
answered with sound analysis.

Part of my original intuitive argument, that I formed some years ago, was
that without a heavy constraint on the instructions for the program, it will
be practically impossible to test or declare that some program is indeed the
shortest program.  However, I can't quite get to the point now that I can
say that there is definitely a diagonalization problem.

Jim Bromer





Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-02 Thread Matt Mahoney
Jim, to address all of your points,

Solomonoff induction claims that the probability of a string is proportional to 
the number of programs that output the string, where each program M is weighted 
by 2^-|M|. The probability is dominated by the shortest program (Kolmogorov 
complexity), but it is not exactly the same. The difference is small enough 
that we may neglect it, just as we neglect differences that depend on choice of 
language.
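
In symbols, with U a fixed universal machine and |M| the length of program M 
in bits, these are the standard definitions:

    P(x) = \sum_{M : U(M) = x} 2^{-|M|}        (algorithmic probability)
    K(x) = \min_{M : U(M) = x} |M|             (Kolmogorov complexity)

so P(x) is dominated by, but not exactly equal to, the single term 2^{-K(x)}.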

Here is the proof that Kolmogorov complexity is not computable. Suppose it 
were. Then I could test the Kolmogorov complexity of strings in increasing 
order of length (breaking ties lexicographically) and describe the first 
string that cannot be described in less than a million bits, contradicting the 
fact that I just did. (Formally, I could write a program that outputs the first 
string whose Kolmogorov complexity is at least n bits, choosing n to be larger 
than my program).

Here is the argument that Occam's Razor and Solomonoff distribution must be 
true. Consider all possible probability distributions p(x) over any infinite 
set X of possible finite strings x, i.e. any X = {x: p(x) > 0} that is 
infinite. All such distributions must favor shorter strings over longer ones. 
Consider any x in X. Then p(x) > 0. There can be at most a finite number (less 
than 1/p(x)) of strings that are more likely than x, and therefore an infinite 
number of strings which are less likely than x. Of this infinite set, only a 
finite number (less than 2^|x|) can be shorter than x, and therefore there must 
be an infinite number that are longer than x. So for each x we can partition X 
into 4 subsets as follows:

- shorter and more likely than x: finite
- shorter and less likely than x: finite
- longer and more likely than x: finite
- longer and less likely than x: infinite.

So in this sense, any distribution over the set of strings must favor shorter 
strings over longer ones.
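
A small C sketch can check the partition against one assumed, explicitly valid 
distribution (each binary string of length n gets p = 2^-(2n+1), which sums to 
1 over all finite strings); it tallies the four classes relative to a string x 
of length 3, scanning far enough that the three finite classes are complete:

    #include <stdio.h>
    #include <math.h>

    /* p depends only on length here: each of the 2^n strings of length n
       gets 2^-(2n+1), so the total is sum_n 2^-(n+1) = 1. */
    double p_of_len(int n) { return ldexp(1.0, -(2 * n + 1)); }

    int main(void) {
        int xlen = 3;                  /* x = any fixed string of length 3 */
        double px = p_of_len(xlen);
        long cnt[2][2] = {{0, 0}, {0, 0}};  /* [shorter?][more likely?] */
        for (int n = 0; n <= 12; ++n) {     /* scan lengths 0..12, skip |x| */
            if (n == xlen) continue;
            cnt[n < xlen][p_of_len(n) > px] += 1L << n;  /* 2^n strings */
        }
        printf("shorter & more likely: %ld (finite)\n", cnt[1][1]);
        printf("shorter & less likely: %ld (finite)\n", cnt[1][0]);
        printf("longer  & more likely: %ld (finite)\n", cnt[0][1]);
        printf("longer  & less likely: %ld so far (grows without bound)\n",
               cnt[0][0]);
        return 0;
    }

With this particular p, likelihood tracks length exactly, so two of the finite 
classes happen to be empty; the argument above only needs them to be finite.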

-- Matt Mahoney, matmaho...@yahoo.com





From: Jim Bromer jimbro...@gmail.com
To: agi agi@v2.listbox.com
Sent: Fri, July 2, 2010 4:09:38 PM
Subject: Re: [agi] Re: Huge Progress on the Core of AGI




On Fri, Jul 2, 2010 at 2:25 PM, Jim Bromer jimbro...@gmail.com wrote:  
There cannot be a one to one correspondence between the representation of the 
shortest program that produces a string and the strings that are produced.  This 
means that if the consideration of the hypotheses were to be put into general 
mathematical form, it must include the potential for many to one relations 
between candidate programs (or subprograms) and output strings.
 
But, there is also no way to determine what the shortest program is, since 
there may be different programs that are the same length.  That means that 
there is a many to one relation between programs and program length.  So the 
claim that you could just iterate through programs by length is false.  This is 
the goal of algorithmic information theory, not a premise of a methodology that 
can be used.  So you have the diagonalization problem. 
 
A counter argument is that there are only a finite number of Turing Machine 
programs of a given length.  However, since you guys have specifically 
designated that this theorem applies to any construction of a Turing Machine, 
it is not clear that this counter argument can be used.  And there is still the 
specific problem that you might want to try a program that writes a longer 
program to output a string (or many strings).  Or you might want to write a 
program that can be called to write longer programs on a dynamic basis.  I 
think these cases, where you might consider a program that outputs a longer 
program (or another instruction string for another Turing Machine), constitute 
a serious problem that, at the least, deserves to be answered with sound 
analysis.
 
Part of my original intuitive argument, that I formed some years ago, was that 
without a heavy constraint on the instructions for the program, it will be 
practically impossible to test or declare that some program is indeed the 
shortest program.  However, I can't quite get to the point now that I can say 
that there is definitely a diagonalization problem.
 
Jim Bromer



Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-02 Thread David Jones
Nice Occam's Razor argument. I understood it simply because I knew there are
always an infinite number of possible explanations for every observation
that are more complicated than the simplest explanation. So, without a
reason to choose one of those other interpretations, why choose one? You
could look for reasons in complex environments, but it would likely be more
efficient to wait for a reason to need a better explanation. It's more
efficient to wait for an inconsistency than to search an infinite set
without a reason to do so.

Dave

On Fri, Jul 2, 2010 at 6:08 PM, Matt Mahoney matmaho...@yahoo.com wrote:

   Jim, to address all of your points,

 Solomonoff induction claims that the probability of a string is
 proportional to the number of programs that output the string, where each
 program M is weighted by 2^-|M|. The probability is dominated by the
 shortest program (Kolmogorov complexity), but it is not exactly the same.
 The difference is small enough that we may neglect it, just as we neglect
 differences that depend on choice of language.

 Here is the proof that Kolmogorov complexity is not computable. Suppose it
 were. Then I could test the Kolmogorov complexity of strings in increasing
 order of length (breaking ties lexicographically) and describe the first
 string that cannot be described in less than a million bits, contradicting
 the fact that I just did. (Formally, I could write a program that outputs
 the first string whose Kolmogorov complexity is at least n bits, choosing n
 to be larger than my program).

 Here is the argument that Occam's Razor and Solomonoff distribution must be
 true. Consider all possible probability distributions p(x) over any infinite
 set X of possible finite strings x, i.e. any X = {x: p(x) > 0} that is
 infinite. All such distributions must favor shorter strings over longer
 ones. Consider any x in X. Then p(x) > 0. There can be at most a finite
 number (less than 1/p(x)) of strings that are more likely than x, and
 therefore an infinite number of strings which are less likely than x. Of
 this infinite set, only a finite number (less than 2^|x|) can be shorter
 than x, and therefore there must be an infinite number that are longer than
 x. So for each x we can partition X into 4 subsets as follows:

 - shorter and more likely than x: finite
 - shorter and less likely than x: finite
 - longer and more likely than x: finite
 - longer and less likely than x: infinite.

 So in this sense, any distribution over the set of strings must favor
 shorter strings over longer ones.


 -- Matt Mahoney, matmaho...@yahoo.com


  --
 From: Jim Bromer jimbro...@gmail.com
 To: agi agi@v2.listbox.com
 Sent: Fri, July 2, 2010 4:09:38 PM

 Subject: Re: [agi] Re: Huge Progress on the Core of AGI



 On Fri, Jul 2, 2010 at 2:25 PM, Jim Bromer jimbro...@gmail.com wrote:

There cannot be a one to one correspondence between the representation of
 the shortest program that produces a string and the strings that are produced.
 This means that if the consideration of the hypotheses were to be put into
 general mathematical form, it must include the potential for many to one
 relations between candidate programs (or subprograms) and output strings.



 But, there is also no way to determine what the shortest program is,
 since there may be different programs that are the same length.  That means
 that there is a many to one relation between programs and program length.
 So the claim that you could just iterate through programs *by length* is
 false.  This is the goal of algorithmic information theory, not a premise
 of a methodology that can be used.  So you have the diagonalization problem.



 A counter argument is that there are only a finite number of Turing Machine
 programs of a given length.  However, since you guys have specifically
 designated that this theorem applies to any construction of a Turing Machine,
 it is not clear that this counter argument can be used.  And there is still
 the specific problem that you might want to try a program that writes a
 longer program to output a string (or many strings).  Or you might want to
 write a program that can be called to write longer programs on a dynamic
 basis.  I think these cases, where you might consider a program that outputs
 a longer program (or another instruction string for another Turing
 Machine), constitute a serious problem that, at the least, deserves to be
 answered with sound analysis.

 Part of my original intuitive argument, that I formed some years ago, was
 that without a heavy constraint on the instructions for the program, it will
 be practically impossible to test or declare that some program is indeed the
 shortest program.  However, I can't quite get to the point now that I can
 say that there is definitely a diagonalization problem.

 Jim Bromer


Re: [agi] Re: Huge Progress on the Core of AGI

2010-06-30 Thread Jim Bromer
Cantor's diagonal argument is (in all likelihood) mathematically correct.
However, the attempt to use Cantor's methodology to derive an irrational
number that is the next greater irrational number from a given irrational
number (to a degree of precision sufficient to distinguish the two numbers)
is not mathematically correct.  If you were to say that Cantor's argument
was mathematically correct, I would agree with you.  As far as I can tell it
is.  However, if you were then to use his method of enumerating irrational
numbers as a means to discover subsequent irrational numbers, I would not
conclude that you understand what it means to say that Cantor's diagonal
argument was mathematically correct.  (However, I am not a mathematician and
I might be wrong in some ways.)

Jim Bromer





Re: [agi] Re: Huge Progress on the Core of AGI

2010-06-30 Thread Abram Demski
Jim,

Well, like I said, it'll only probably lead you to accept AIT. :) In my
case, it led me to accept AIT but not AIXI, with reasons somewhat similar to
the ones Steve recently mentioned.

I agree that there is not a perfect equivalence; the math here is subtle.
Just saying it's equivalent glosses over many details...

--Abram

On Wed, Jun 30, 2010 at 9:13 AM, Jim Bromer jimbro...@gmail.com wrote:

 On Tue, Jun 29, 2010 at 11:46 PM, Abram Demski abramdem...@gmail.com wrote:
 In brief, the answer to your question is: we formalize the description
 length heuristic by assigning lower probabilities to longer hypotheses, and
 we apply Bayes law to update these probabilities given the data we observe.
 This updating captures the idea that we should reward theories which
 explain/expect more of the observations; it also provides a natural way to
 balance simplicity vs explanatory power, so that we can compare any two
 theories with a single scoring mechanism. Bayes Law automatically places the
 right amount of pressure to avoid overly elegant explanations which don't
 get much right, and to avoid overly complex explanations which fit the
 observations perfectly but which probably won't generalize to new data.
 ...
 If you go down this path, you will eventually come to understand (and,
 probably, accept) algorithmic information theory. Matt may be trying to force
 it on you too soon. :)
 --Abram

 David was asking about theories of explanation, and here you are suggesting
 that following a certain path of reasoning will lead to accepting AIT.  What
 nonsense.  Even assuming that Bayes' law can be used to update probabilities
 of idealized utility, the connection between description length and
 explanatory power in general AI is tenuous.  And when you realize that AIT
 is an unattainable idealism that lacks mathematical power (I do not believe
 that it is a valid mathematical method, because it is incomputable and
 therefore innumerable and cannot be used to derive probability distributions
 even as ideals), you have to accept that the connection between explanatory
 theories and AIT is not established except as a special case, based on the
 imagination that a similarity between a subclass of practical examples is
 the same as a powerful generalization of those examples.

 The problem is that while compression seems to be related to intelligence,
 it is not equivalent to intelligence.  A much stronger but similarly false
 argument is that memory is intelligence.  Of course memory is a major part
 of intelligence, but it is not everything.  The argument that AIT is a
 reasonable substitute for developing more sophisticated theories about
 conceptual explanation is not well founded; it lacks any experimental
 evidence other than a spattering of results on simplistic cases, and it is
 just wrong to suggest that there is no reason to consider other theories of
 explanation.

 Yes, compression has something to do with intelligence and, in some special
 cases, it can be shown to act as an idealism for numerical rationality.  And
 yes, unattainable theories that examine the boundaries of productive
 mathematical systems are a legitimate subject for mathematics.  But there is
 so much more to theories of explanatory reasoning that I genuinely feel
 sorry for those of you who, originally motivated to develop better AGI
 programs, would get caught in the obvious traps of AIT and AIXI.

 Jim Bromer


 On Tue, Jun 29, 2010 at 11:46 PM, Abram Demski abramdem...@gmail.com wrote:

 David,

 What Matt is trying to explain is all right, but I think a better way of
 answering your question would be to invoke the mighty mysterious Bayes' Law.

 I had an epiphany similar to yours (the one that started this thread)
 about 5 years ago now. At the time I did not know that it had all been done
 before. I think many people feel this way about MDL. Looking into the MDL
 (minimum description length) literature would be a good starting point.

 In brief, the answer to your question is: we formalize the description
 length heuristic by assigning lower probabilities to longer hypotheses, and
 we apply Bayes law to update these probabilities given the data we observe.
 This updating captures the idea that we should reward theories which
 explain/expect more of the observations; it also provides a natural way to
 balance simplicity vs explanatory power, so that we can compare any two
 theories with a single scoring mechanism. Bayes Law automatically places the
 right amount of pressure to avoid overly elegant explanations which don't
 get much right, and to avoid overly complex explanations which fit the
 observations perfectly but which probably won't generalize to new data.
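
 A minimal numeric sketch of that trade-off, with invented hypotheses and
 invented bit-costs (the 10- and 20-bit description lengths and the coin-flip
 setting are assumptions for illustration only): score each hypothesis by the
 description length of H plus the description length of the data given H,
 which is the -log2 of prior-times-likelihood under the 2^-length prior.

    #include <stdio.h>
    #include <math.h>

    /* MDL-style score in bits: len(H) + (-log2 P(data|H)).  Smaller is
       better; it equals -log2 of (2^-len(H) * P(data|H)). */
    double score(int hyp_bits, double p_head, int heads, int tails) {
        return hyp_bits - heads * log2(p_head) - tails * log2(1.0 - p_head);
    }

    int main(void) {
        /* H1 "fair coin" (short to state: 10 bits assumed), p(head) = 0.5
           H2 "bias 0.9"  (longer to state: 20 bits assumed), p(head) = 0.9 */
        printf("9 heads, 1 tail:    H1=%.1f bits  H2=%.1f bits\n",
               score(10, 0.5, 9, 1), score(20, 0.9, 9, 1));
        printf("90 heads, 10 tails: H1=%.1f bits  H2=%.1f bits\n",
               score(10, 0.5, 90, 10), score(20, 0.9, 90, 10));
        return 0;
    }

 With 10 flips the simpler H1 wins (20.0 vs 24.7 bits); with 100 flips the
 extra 10 bits of H2 are repaid by a much cheaper data description (110.0 vs
 66.9), which is exactly the pressure against over- and under-fitting above.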

 Bayes' Law and MDL have strong connections, though sometimes they part
 ways. There are deep theorems here. For me it's good enough to note that if
 we're using a maximally efficient code for our knowledge representation,
 they are equivalent. (This in itself involves some deep 

Re: [agi] Re: Huge Progress on the Core of AGI

2010-06-30 Thread Matt Mahoney
Jim, what evidence do you have that Occam's Razor or algorithmic information 
theory is wrong, besides your own opinions? It is well established that elegant 
(short) theories are preferred in all branches of science because they have 
greater predictive power.

Also, what does this have to do with Cantor's diagonalization argument? AIT 
considers only the countably infinite set of hypotheses.

 -- Matt Mahoney, matmaho...@yahoo.com





From: Jim Bromer jimbro...@gmail.com
To: agi agi@v2.listbox.com
Sent: Wed, June 30, 2010 9:13:44 AM
Subject: Re: [agi] Re: Huge Progress on the Core of AGI


On Tue, Jun 29, 2010 at 11:46 PM, Abram Demski abramdem...@gmail.com wrote:
In brief, the answer to your question is: we formalize the description length 
heuristic by assigning lower probabilities to longer hypotheses, and we apply 
Bayes law to update these probabilities given the data we observe. This 
updating captures the idea that we should reward theories which explain/expect 
more of the observations; it also provides a natural way to balance simplicity 
vs explanatory power, so that we can compare any two theories with a single 
scoring mechanism. Bayes Law automatically places the right amount of pressure 
to avoid overly elegant explanations which don't get much right, and to avoid 
overly complex explanations which fit the observations perfectly but which 
probably won't generalize to new data.
...
If you go down this path, you will eventually come to understand (and, 
probably, accept) algorithmic information theory. Matt may be trying to force it 
on you too soon. :)
--Abram 
 
David was asking about theories of explanation, and here you are suggesting 
that following a certain path of reasoning will lead to accepting AIT.  What 
nonsense.  Even assuming that Bayes' law can be used to update probabilities of 
idealized utility, the connection between description length and explanatory 
power in general AI is tenuous.  And when you realize that AIT is an 
unattainable idealism that lacks mathematical power (I do not believe that it 
is a valid mathematical method, because it is incomputable and therefore 
innumerable and cannot be used to derive probability distributions even as 
ideals), you have to accept that the connection between explanatory theories and 
AIT is not established except as a special case, based on the imagination that a 
similarity between a subclass of practical examples is the same as a powerful 
generalization of those examples.  
 
The problem is that while compression seems to be related to intelligence, it 
is not equivalent to intelligence.  A much stronger but similarly false 
argument is that memory is intelligence.  Of course memory is a major part of 
intelligence, but it is not everything.  The argument that AIT is a reasonable 
substitute for developing more sophisticated theories about conceptual 
explanation is not well founded; it lacks any experimental evidence other than 
a spattering of results on simplistic cases, and it is just wrong to suggest 
that there is no reason to consider other theories of explanation.
 
Yes, compression has something to do with intelligence and, in some special 
cases, it can be shown to act as an idealism for numerical rationality.  And yes, 
unattainable theories that examine the boundaries of productive mathematical 
systems are a legitimate subject for mathematics.  But there is so much more to 
theories of explanatory reasoning that I genuinely feel sorry for those of you 
who, originally motivated to develop better AGI programs, would get caught in 
the obvious traps of AIT and AIXI.
 
Jim Bromer 

 
On Tue, Jun 29, 2010 at 11:46 PM, Abram Demski abramdem...@gmail.com wrote:

David,

What Matt is trying to explain is all right, but I think a better way of 
answering your question would be to invoke the mighty mysterious Bayes' Law.

I had an epiphany similar to yours (the one that started this thread) about 5 
years ago now. At the time I did not know that it had all been done before. I 
think many people feel this way about MDL. Looking into the MDL (minimum 
description length) literature would be a good starting point.

In brief, the answer to your question is: we formalize the description length 
heuristic by assigning lower probabilities to longer hypotheses, and we apply 
Bayes law to update these probabilities given the data we observe. This 
updating captures the idea that we should reward theories which explain/expect 
more of the observations; it also provides a natural way to balance simplicity 
vs explanatory power, so that we can compare any two theories with a single 
scoring mechanism. Bayes Law automatically places the right amount of pressure 
to avoid overly elegant explanations which don't get much right, and to avoid 
overly complex explanations which fit the observations perfectly but which 
probably won't generalize to new data.

Bayes' Law and MDL have strong connections, though

Re: [agi] Re: Huge Progress on the Core of AGI

2010-06-29 Thread Matt Mahoney
David Jones wrote:
 If anyone has any knowledge of or references to the state of the art in 
 explanation-based reasoning, can you send me keywords or links? 

The simplest explanation of the past is the best predictor of the future.
http://en.wikipedia.org/wiki/Occam's_razor
http://www.scholarpedia.org/article/Algorithmic_probability

 -- Matt Mahoney, matmaho...@yahoo.com





From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Tue, June 29, 2010 9:05:45 AM
Subject: [agi] Re: Huge Progress on the Core of AGI

If anyone has any knowledge of or references to the state of the art in 
explanation-based reasoning, can you send me keywords or links? I've read some 
through google, but I'm not really satisfied with anything I've found. 

Thanks,

Dave


On Sun, Jun 27, 2010 at 1:31 AM, David Jones davidher...@gmail.com wrote:

A method for comparing hypotheses in explanatory-based reasoning: 

We prefer the hypothesis or explanation that *expects* more observations. If 
both explanations expect the same observations, then the simpler of the two is 
preferred (because the unnecessary terms of the more complicated explanation 
do not add to the predictive power). 

Why are expected events so important? They are a measure of 1) explanatory 
power and 2) predictive power. The more predictive and 
the more explanatory a hypothesis is, the more likely the hypothesis is when 
compared to a competing hypothesis.

Here are two case studies I've been analyzing from sensory perception of 
simplified visual input:


The goal of the case studies is to answer the following: How do you generate 
the most likely motion hypothesis in a way that is 
general and applicable to AGI?
Case Study 1) Here is a link to an example: animated gif of two black squares 
move from left to right. Description: Two black squares are moving in unison 
from left to right across a white screen. In each frame the black squares 
shift to the right so that square 1 steals square 2's original position and 
square two moves an equal distance to the right.
Case Study 2) Here is a link to an example: the interrupted square. 
Description: A single square is moving from left to right. Suddenly in the 
third frame, a single black square is added in the middle of the expected path 
of the original black square. This second square just stays there. So, what 
happened? Did the square moving from left to right keep moving? Or did it stop 
and then another square suddenly appeared and moved from left to right?

Here is a simplified version of how we solve case study 1:
The important hypotheses to consider are: 
1) the square from frame 1 of the video that has a very close position to the 
square from frame 2 should be matched (we hypothesize that they are the same 
square and that any difference in position is motion).  So, what happens is 
that in each two frames of the video, we only match one square. The other 
square goes unmatched.   


2) We do the same thing as in hypothesis #1, but this time we also match the 
remaining squares and hypothesize motion as follows: the first square jumps 
over the second square from left to right. We hypothesize that this happens 
over and over in each frame of the video. Square 2 stops and square 1 jumps 
over it over and over again. 


3) We hypothesize that both squares move to the right in unison. This is the 
correct hypothesis.

So, why should we prefer the correct hypothesis, #3 over the other two?

Well, first of all, #3 is correct because it has the most explanatory power of 
the three and is the simplest of the three. Simpler is better because, with 
the given evidence and information, there is no reason to desire a more 
complicated hypothesis such as #2. 

So, the answer to the question is because explanation #3 expects the most 
observations, such as: 
1) the consistent relative positions of the squares in each frame are 
expected. 
2) It also expects their new positions in each frame based on velocity 
calculations. 


3) It expects both squares to occur in each frame. 

Explanation 1 ignores 1 square from each frame of the video, because it can't 
match it. Hypothesis #1 doesn't have a reason for why a new square appears 
in each frame and why one disappears. It doesn't expect these observations. In 
fact, explanation 1 doesn't expect anything that happens because something new 
happens in each frame, which doesn't give it a chance to confirm its 
hypotheses in subsequent frames.

The power of this method is immediately clear. It is general and it solves the 
problem very cleanly.
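
One toy way to make that comparison mechanical (a C sketch with invented 
counts and sizes, not the actual procedure described above): score each 
hypothesis by how many of its expected observations are confirmed, and break 
ties toward the simpler statement.

    #include <stdio.h>

    /* Invented tallies for the three hypotheses in case study 1:
       confirmed = expected observations that actually occurred,
       size      = rough cost of stating the hypothesis. */
    struct hyp { const char *name; int confirmed; int size; };

    int better(struct hyp a, struct hyp b) {
        if (a.confirmed != b.confirmed) return a.confirmed > b.confirmed;
        return a.size < b.size;            /* tie-break: simpler wins */
    }

    int main(void) {
        struct hyp h[] = {
            {"#1 leave one square unmatched", 1, 1},
            {"#2 square 1 jumps repeatedly",  6, 3},
            {"#3 both move in unison",        6, 2},
        };
        struct hyp best = h[0];
        for (int i = 1; i < 3; ++i)
            if (better(h[i], best)) best = h[i];
        printf("preferred: %s\n", best.name);  /* prints hypothesis #3 */
        return 0;
    }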

Here is a simplified version of how we solve case study 2:
We expect the original square to move at a similar velocity from left to right 
because we hypothesized that it did move from left to right and we calculated 
its velocity. If this expectation is confirmed, then it is more likely than 
saying that the square suddenly stopped and another started moving

Re: [agi] Re: Huge Progress on the Core of AGI

2010-06-29 Thread David Jones
Thanks Matt,

Right. But Occam's Razor is not complete. It says simpler is better, but 1)
this only applies when two hypotheses have the same explanatory power and 2)
what defines simpler?

So, maybe what I want to know from the state of the art in research is:

1) how precisely do other people define simpler
and
2) More importantly, how do you compare competing explanations/hypotheses
that have more or less explanatory power. Simpler does not apply unless you
are comparing equally explanatory hypotheses.

For example, the simplest hypothesis for all visual interpretation is that
everything in the first image is gone in the second image, and everything in
the second image is a new object. Simple. Done. Solved :) right? Well,
clearly a more complicated explanation is warranted because a more
complicated explanation is more *explanatory* and a better explanation. So,
why is it better? Can it be defined as better in a precise way so that you
can compare arbitrary hypotheses or explanations? That is what I'm trying to
learn about. I don't think much progress has been made in this area, but I'd
like to know what other people have done and any successes they've had.

Dave


On Tue, Jun 29, 2010 at 10:29 AM, Matt Mahoney matmaho...@yahoo.com wrote:

 David Jones wrote:
  If anyone has any knowledge of or references to the state of the art in
 explanation-based reasoning, can you send me keywords or links?

 The simplest explanation of the past is the best predictor of the future.
 http://en.wikipedia.org/wiki/Occam%27s_razor
 http://www.scholarpedia.org/article/Algorithmic_probability

 -- Matt Mahoney, matmaho...@yahoo.com


 --
 From: David Jones davidher...@gmail.com

 To: agi agi@v2.listbox.com
 Sent: Tue, June 29, 2010 9:05:45 AM
 Subject: [agi] Re: Huge Progress on the Core of AGI

 If anyone has any knowledge of or references to the state of the art in
 explanation-based reasoning, can you send me keywords or links? I've read
 some through google, but I'm not really satisfied with anything I've found.

 Thanks,

 Dave

 On Sun, Jun 27, 2010 at 1:31 AM, David Jones davidher...@gmail.com wrote:

 A method for comparing hypotheses in explanatory-based reasoning:

 We prefer the hypothesis or explanation that *expects* more
 observations. If both explanations expect the same observations, then the
 simpler of the two is preferred (because the unnecessary terms of the more
 complicated explanation do not add to the predictive power).

 Why are expected events so important? They are a measure of 1)
 explanatory power and 2) predictive power. The more predictive and the more
 explanatory a hypothesis is, the more likely the hypothesis is when compared
 to a competing hypothesis.

 Here are two case studies I've been analyzing from sensory perception of
 simplified visual input:
 The goal of the case studies is to answer the following: How do you
 generate the most likely motion hypothesis in a way that is general and
 applicable to AGI?
 Case Study 1) Here is a link to an example: animated gif of two black
 squares moving from left to right: http://practicalai.org/images/CaseStudy1.gif.
 Description: Two black squares are moving in unison from left to right
 across a white screen. In each frame the black squares shift to the right so
 that square 1 steals square 2's original position and square two moves an
 equal distance to the right.
 Case Study 2) Here is a link to an example: the interrupted square:
 http://practicalai.org/images/CaseStudy2.gif.
 Description: A single square is moving from left to right. Suddenly in
 the third frame, a single black square is added in the middle of the
 expected path of the original black square. This second square just stays
 there. So, what happened? Did the square moving from left to right keep
 moving? Or did it stop and then another square suddenly appeared and moved
 from left to right?

 Here is a simplified version of how we solve case study 1:
 The important hypotheses to consider are:
 1) the square from frame 1 of the video that has a very close position to
 the square from frame 2 should be matched (we hypothesize that they are the
 same square and that any difference in position is motion).  So, what
 happens is that in each two frames of the video, we only match one square.
 The other square goes unmatched.
 2) We do the same thing as in hypothesis #1, but this time we also match
 the remaining squares and hypothesize motion as follows: the first square
 jumps over the second square from left to right. We hypothesize that this
 happens over and over in each frame of the video. Square 2 stops and square
 1 jumps over it over and over again.
 3) We hypothesize that both squares move to the right in unison. This is
 the correct hypothesis.

 So, why should we prefer

Re: [agi] Re: Huge Progress on the Core of AGI

2010-06-29 Thread Matt Mahoney
 Right. But Occam's Razor is not complete. It says simpler is better, but 1) 
 this only applies when two hypotheses have the same explanatory power and 2) 
 what defines simpler? 

A hypothesis is a program that outputs the observed data. It explains the 
data if its output matches what is observed. The simpler hypothesis is the 
shorter program, measured in bits.

The language used to describe the data can be any Turing complete programming 
language (C, Lisp, etc) or any natural language such as English. It does not 
matter much which language you use, because for any two languages there is a 
fixed length procedure, described in either of the languages, independent of 
the data, that translates descriptions in one language to the other.
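
This is the standard invariance theorem: for any two description languages U 
and V there is a constant c_{U,V}, independent of x (the length of a 
V-interpreter written in U), such that

    K_U(x) <= K_V(x) + c_{U,V}

so the choice of language shifts description lengths by at most a constant.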

 For example, the simplest hypothesis for all visual interpretation is that 
 everything in the first image is gone in the second image, and everything in 
 the second image is a new object. Simple. Done. Solved :) right? 

The hypothesis is not the simplest. The program that outputs the two frames as 
if they were independent cannot be smaller than the two frames compressed independently. 
The program could be made smaller if it only described how the second frame is 
different than the first. It would be more likely to correctly predict the 
third frame if it continued to run and described how it would be different than 
the second frame.
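
A tiny C sketch of that difference-description idea, on invented 8-pixel 
"frames": the XOR of consecutive frames is mostly zeros when little changes, 
so describing the diff takes fewer bits than describing the second frame from 
scratch.

    #include <stdio.h>

    int main(void) {
        /* Toy frames: a 2-pixel "square" shifts one position right. */
        unsigned char f1[8] = {0, 0, 1, 1, 0, 0, 0, 0};
        unsigned char f2[8] = {0, 0, 0, 1, 1, 0, 0, 0};
        int changed = 0;
        printf("diff: ");
        for (int i = 0; i < 8; ++i) {
            int d = f1[i] ^ f2[i];     /* 1 only where the frames differ */
            changed += d;
            printf("%d", d);
        }
        printf("\nonly %d of 8 pixels need describing; the rest are"
               " implied by frame 1\n", changed);
        return 0;
    }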

 I don't think much progress has been made in this area, but I'd like to know 
 what other people have done and any successes they've had.

Kolmogorov proved that the solution is not computable. Given a hypothesis (a 
description of the observed data, or a program that outputs the observed data), 
there is no general procedure or test to determine whether a shorter (simpler, 
better) hypothesis exists. Proof: suppose there were. Then I could describe 
the first data set that cannot be described in less than a million bits even 
though I just did. (By first I mean the first data set encoded by a string 
from shortest to longest, breaking ties lexicographically).

That said, I believe the state of the art in both language and vision is based 
on hierarchical neural models, i.e. pattern recognition using learned weighted 
combinations of simpler patterns. I am more familiar with language. The top 
ranked programs can be found at http://mattmahoney.net/dc/text.html

 -- Matt Mahoney, matmaho...@yahoo.com





From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Tue, June 29, 2010 10:44:41 AM
Subject: Re: [agi] Re: Huge Progress on the Core of AGI

Thanks Matt,

Right. But Occam's Razor is not complete. It says simpler is better, but 1) 
this only applies when two hypotheses have the same explanatory power and 2) 
what defines simpler? 

So, maybe what I want to know from the state of the art in research is: 

1) how precisely do other people define simpler
and
2) More importantly, how do you compare competing explanations/hypotheses that 
have more or less explanatory power. Simpler does not apply unless you are 
comparing equally explanatory hypotheses. 

For example, the simplest hypothesis for all visual interpretation is that 
everything in the first image is gone in the second image, and everything in 
the second image is a new object. Simple. Done. Solved :) right? Well, clearly 
a more complicated explanation is warranted because a more complicated 
explanation is more *explanatory* and a better explanation. So, why is it 
better? Can it be defined as better in a precise way so that you can compare 
arbitrary hypotheses or explanations? That is what I'm trying to learn about. I 
don't think much progress has been made in this area, but I'd like to know what 
other people have done and any successes they've had.

Dave



On Tue, Jun 29, 2010 at 10:29 AM, Matt Mahoney matmaho...@yahoo.com wrote:

David Jones wrote:
 If anyone has any knowledge of or references to the state of the art in 
 explanation-based reasoning, can you send me keywords or links? 


The simplest explanation of the past is the best predictor of the future.
http://en.wikipedia.org/wiki/Occam's_razor
http://www.scholarpedia.org/article/Algorithmic_probability

 -- Matt Mahoney, matmaho...@yahoo.com






From: David Jones davidher...@gmail.com

To: agi agi@v2.listbox.com
Sent: Tue, June 29, 2010 9:05:45 AM
Subject: [agi] Re: Huge Progress on the Core of AGI


If anyone has any knowledge of or references to the state of the art in 
explanation-based reasoning, can you send me keywords or links? I've read some 
through google, but I'm not really satisfied with anything I've found. 

Thanks,

Dave


On Sun, Jun 27, 2010 at 1:31 AM, David Jones davidher...@gmail.com wrote:

A method for comparing hypotheses in explanatory-based reasoning: 

We prefer the hypothesis or explanation that *expects* more observations. If 
both explanations expect the same observations, then the simpler of the two

Re: [agi] Re: Huge Progress on the Core of AGI

2010-06-29 Thread Matt Mahoney
David Jones wrote:
 I really don't think this is the right way to calculate simplicity. 

I will give you an example, because examples are more convincing than proofs.

Suppose you perform a sequence of experiments whose outcome can either be 0 or 
1. In the first 10 trials you observe 00. What do you expect to observe 
in the next trial?

Hypothesis 1: the outcome is always 0.
Hypothesis 2: the outcome is 0 for the first 10 trials and 1 thereafter.

Hypothesis 1 is shorter than 2, so it is more likely to be correct.

If I describe the two hypotheses in French or Chinese, then 1 is still shorter 
than 2.

If I describe the two hypotheses in C, then 1 is shorter than 2.

  /* requires #include <stdio.h> */
  void hypothesis_1() {               /* prints 0 forever */
    while (1) printf("0");
  }

  void hypothesis_2() {               /* prints 0 ten times, then 1 forever */
    int i;
    for (i = 0; i < 10; ++i) printf("0");
    while (1) printf("1");
  }

If I translate these programs into Perl or Lisp or x86 assembler, then 1 will 
still be shorter than 2.

I realize there might be smaller equivalent programs. But I think the smallest 
program equivalent to hypothesis_1 will be smaller than the smallest program 
equivalent to hypothesis_2.

I realize there are other hypotheses than 1 or 2. But I think that the smallest 
one you can find that outputs eleven bits of which the first ten are zeros will 
be a program that outputs another zero.

I realize that you could rewrite 1 so that it is longer than 2. But it is the 
shortest version that counts. More specifically, consider all programs in which 
the first 10 outputs are 0. Then weight each program by 2^-length. So the 
shortest programs dominate.

I realize you could make up a language where the shortest encoding of 
hypothesis 2 is shorter than 1. You could do this for any pair of hypotheses. 
However, I think if you stick to simple languages (and I realize this is a 
circular definition), then 1 will usually be shorter than 2.

 -- Matt Mahoney, matmaho...@yahoo.com





From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Tue, June 29, 2010 1:31:01 PM
Subject: Re: [agi] Re: Huge Progress on the Core of AGI




On Tue, Jun 29, 2010 at 11:26 AM, Matt Mahoney matmaho...@yahoo.com wrote:

 Right. But Occam's Razor is not complete. It says simpler is better, but 1) 
 this only applies when two hypotheses have the same explanatory power and 2) 
 what defines simpler? 


A hypothesis is a program that outputs the observed data. It explains the 
data if its output matches what is observed. The simpler hypothesis is the 
shorter program, measured in bits.

I can't be confident that bits is the right way to do it. I suspect bits is an 
approximation of a more accurate method. I also suspect that you can write a 
more complex explanation program with the same number of bits. So, there are 
some flaws with this approach. It is an interesting idea to consider though. 
 



The language used to describe the data can be any Turing complete programming 
language (C, Lisp, etc.) or any natural language such as English. It does not 
matter much which language you use, because for any two languages there is a 
fixed length procedure, described in either of the languages, independent of 
the data, that translates descriptions in one language to the other.
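
(For reference, this is the invariance theorem of algorithmic information 
theory: writing K_L(x) for the length of the shortest description of x in 
language L, for any two Turing complete languages L1 and L2 there is a 
constant c, depending only on the languages and not on the data, such that

  K_{L1}(x) \le K_{L2}(x) + c  for all x,

where c is essentially the length of the translation procedure.)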

Hypotheses don't have to be written in actual computer code and probably 
shouldn't be, because hypotheses are not really meant to be run per se. And 
outputs are not necessarily the right way to put it either. Outputs imply 
prediction. And as Mike has often pointed out, things cannot be precisely 
predicted. We can, however, determine whether a particular observation fits 
expectations, rather than equals some prediction. There may be multiple 
possible outcomes that we expect and which would be consistent with a 
hypothesis, which is why actual prediction should not be used.


 For example, the simplest hypothesis for all visual interpretation is that 
 everything in the first image is gone in the second image, and everything in 
 the second image is a new object. Simple. Done. Solved :) right? 


The hypothesis is not the simplest. The program that outputs the two frames as 
if independent cannot be smaller than the two frames compressed independently. 
The program could be made smaller if it only described how the second frame 
differs from the first. It would be more likely to correctly predict the third 
frame if it continued to run, describing how each new frame differs from the 
previous one.
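
(A minimal sketch of that delta idea, mine and not from the email: describe 
the second frame only where it differs from the first, so related frames cost 
far less to encode than independent ones. The 8-pixel frames are invented.)

  #include <stdio.h>

  /* Toy delta encoding: list (index, value) pairs where frame2
     differs from frame1. The frames are invented examples. */
  int main(void) {
      int frame1[8] = {3, 3, 7, 7, 1, 1, 0, 0};
      int frame2[8] = {3, 3, 7, 9, 1, 1, 0, 2};  /* two pixels changed */
      int i, changes = 0;
      for (i = 0; i < 8; ++i)
          if (frame1[i] != frame2[i]) {
              printf("pixel %d -> %d\n", i, frame2[i]);
              ++changes;
          }
      printf("%d of 8 pixels need encoding\n", changes);
      return 0;
  }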

I really don't think this is the right way to calculate simplicity. 
 



 I don't think much progress has been made in this area, but I'd like to know 
 what other people have done and any successes they've had.


Kolmogorov proved that the solution is not computable. Given a hypothesis (a 
description of the observed data, or a program that outputs the observed 
data), there is no general procedure or test to determine whether a shorter 
(simpler, better) hypothesis exists. Proof: suppose there were. Then I could 
describe "the first string that cannot be described in less than a million 
bits", which is itself a description of that string in less than a million 
bits, a contradiction.
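
(Stated formally, as my addition rather than Matt's: writing 
K(x) = \min \{ |p| : U(p) = x \} for a universal machine U, no algorithm 
computes K, so in particular no algorithm can certify that a given 
description of x is the shortest one.)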

Re: [agi] Re: Huge Progress on the Core of AGI

2010-06-29 Thread David Jones
Such an example is nowhere near sufficient to accept the assertion that
program size is the right way to define simplicity of a hypothesis.

Here is a counterexample. It requires a slightly more complex sequence, 
because all zeros doesn't leave any room for alternative hypotheses.

Here is the sequence: 10, 21, 32

void hypothesis_1() {
    int ten = 10;
    int counter = 0;
    while (1) {
        printf("%d ", ten + counter);  /* 10, 21, 32, 43... */
        ten = ten + 10;
        counter = counter + 1;
    }
}

void hypothesis_2() {
    while (1) printf("10 21 32 ");
}


Hypothesis 2 is simpler, yet clearly wrong. These examples don't really show
anything.

Dave


Re: [agi] Re: Huge Progress on the Core of AGI

2010-06-29 Thread Matt Mahoney
You can always find languages that favor either hypothesis. Suppose that you 
want to predict the sequence 10, 21, 32, ? and we write our hypothesis as a 
function that takes the trial number (0, 1, 2, 3...) and returns the outcome. 
The sequence 10, 21, 32, 43, 54... would be coded:

int hypothesis_1(int trial) {
  return trial*11+10;
}

The sequence 10, 21, 32, 10, 21, 32... would be coded

int hypothesis_2(int trial) {
  return trial%3*11+10;
}

which is longer and therefore less likely.

Here is another example: predict the sequence 0, 1, 4, 9, 16, 25, 36, 49, ?

Can you find a program shorter than this that doesn't predict 64?

int hypothesis_1(int trial) {
  return trial*trial;
}
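
(To sanity-check the three functions, a throwaway driver like this, not part 
of the email, prints the first few predictions of each; the squares function 
is renamed hypothesis_3 here to avoid the name clash with the first 
hypothesis_1.)

  #include <stdio.h>

  int hypothesis_1(int trial) { return trial*11 + 10; }    /* 10, 21, 32, 43... */
  int hypothesis_2(int trial) { return trial%3*11 + 10; }  /* 10, 21, 32, 10... */
  int hypothesis_3(int trial) { return trial*trial; }      /* 0, 1, 4, 9... */

  int main(void) {
      int t;
      for (t = 0; t < 5; ++t)
          printf("%d %d %d\n", hypothesis_1(t), hypothesis_2(t), hypothesis_3(t));
      return 0;
  }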

 -- Matt Mahoney, matmaho...@yahoo.com






Re: [agi] Re: Huge Progress on the Core of AGI

2010-06-29 Thread Abram Demski
David,

What Matt is trying to explain is all right, but I think a better way of
answering your question would be to invoke the mighty mysterious Bayes' Law.

I had an epiphany similar to yours (the one that started this thread) about
5 years ago now. At the time I did not know that it had all been done
before. I think many people feel this way about MDL. Looking into the MDL
(minimum description length) literature would be a good starting point.

In brief, the answer to your question is: we formalize the description
length heuristic by assigning lower probabilities to longer hypotheses, and
we apply Bayes' law to update these probabilities given the data we observe.
This updating captures the idea that we should reward theories which
explain/expect more of the observations; it also provides a natural way to
balance simplicity vs explanatory power, so that we can compare any two
theories with a single scoring mechanism. Bayes' law automatically places the
right amount of pressure to avoid overly elegant explanations which don't
get much right, and to avoid overly complex explanations which fit the
observations perfectly but which probably won't generalize to new data.
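
(In symbols, and as my gloss rather than Abram's: with the prior
P(h) = 2^{-|h|} for a hypothesis h of length |h| bits, Bayes' law gives

  P(h \mid D) \propto P(D \mid h) \, 2^{-|h|},

so a hypothesis loses probability both for being long and for failing to
expect the observed data D.)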

Bayes' Law and MDL have strong connections, though sometimes they part ways.
There are deep theorems here. For me it's good enough to note that if we're
using a maximally efficient code for our knowledge representation, they are
equivalent. (This in itself involves some deep math; I can explain if you're
interested, though I believe I've already posted a writeup to this list in
the past.) Bayesian updating is essentially equivalent to scoring hypotheses
as: hypothesis size + size of data's description using hypothesis. Lower
scores are better (as the score is approximately -log(probability)).
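(A worked instance of that score, with invented sizes: for the ten-zeros data
earlier in the thread, say "always 0" takes 10 bits to state and, since it
expects every observation, 0 further bits to encode the data; "fair coin"
takes 8 bits to state plus 1 bit per trial; "0 ten times, then 1" takes 25
bits to state and 0 for the data. Then

  score(h) = |h| + |D given h| \approx -\log_2 P(h) - \log_2 P(D \mid h)

gives 10 + 0 = 10, 8 + 10 = 18, and 25 + 0 = 25 bits respectively: the
simple, fully explanatory hypothesis wins.)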

If you go down this path, you will eventually come to understand (and,
probably, accept) algorithmic information theory. Matt may be trying to force
it on you too soon. :)

--Abram

On Tue, Jun 29, 2010 at 10:44 AM, David Jones davidher...@gmail.com wrote:

 Thanks Matt,

 Right. But Occam's Razor is not complete. It says simpler is better, but 1)
 this only applies when two hypotheses have the same explanatory power and 2)
 what defines simpler?

 So, maybe what I want to know from the state of the art in research is:

 1) how precisely do other people define simpler
 and
 2) More importantly, how do you compare competing explanations/hypotheses
 that have more or less explanatory power. Simpler does not apply unless you
 are comparing equally explanatory hypotheses.

 For example, the simplest hypothesis for all visual interpretation is that
 everything in the first image is gone in the second image, and everything in
 the second image is a new object. Simple. Done. Solved :) right? Well,
 clearly a more complicated explanation is warranted because a more
 complicated explanation is more *explanatory* and a better explanation. So,
 why is it better? Can it be defined as better in a precise way so that you
 can compare arbitrary hypotheses or explanations? That is what I'm trying to
 learn about. I don't think much progress has been made in this area, but I'd
 like to know what other people have done and any successes they've had.

 Dave


 On Tue, Jun 29, 2010 at 10:29 AM, Matt Mahoney matmaho...@yahoo.com wrote:

 David Jones wrote:
  If anyone has any knowledge of or references to the state of the art in
 explanation-based reasoning, can you send me keywords or links?

 The simplest explanation of the past is the best predictor of the future.
 http://en.wikipedia.org/wiki/Occam%27s_razor
 http://www.scholarpedia.org/article/Algorithmic_probability

 -- Matt Mahoney, matmaho...@yahoo.com


 --
 *From:* David Jones davidher...@gmail.com

 *To:* agi agi@v2.listbox.com
 *Sent:* Tue, June 29, 2010 9:05:45 AM
 *Subject:* [agi] Re: Huge Progress on the Core of AGI

 If anyone has any knowledge of or references to the state of the art in
 explanation-based reasoning, can you send me keywords or links? I've read
 some through google, but I'm not really satisfied with anything I've found.

 Thanks,

 Dave

 On Sun, Jun 27, 2010 at 1:31 AM, David Jones davidher...@gmail.com wrote:

 A method for comparing hypotheses in explanatory-based reasoning:

 We prefer the hypothesis or explanation that *expects* more
 observations. If both explanations expect the same observations, then the
 simpler of the two is preferred (because the unnecessary terms of the more
 complicated explanation do not add to the predictive power).

 Why are expected events so important? They are a measure of 1)
 explanatory power and 2) predictive power. The more predictive and the more
 explanatory a hypothesis is, the more likely the hypothesis is when compared
 to a competing hypothesis.

 Here are two case